Posted to reviews@spark.apache.org by petermaxlee <gi...@git.apache.org> on 2016/07/28 07:37:31 UTC

[GitHub] spark pull request #14389: [SPARK-16714][SQL] map, struct function should ac...

GitHub user petermaxlee opened a pull request:

    https://github.com/apache/spark/pull/14389

    [SPARK-16714][SQL] map, struct function should accept decimals with different precision/scale

    ## What changes were proposed in this pull request?
    This patch changes the type coercion rule for map and struct functions so they can accept decimals of different precision/scale. This is not a regression from Spark 1.x, but it is a much bigger problem in Spark 2.0 because floating point literals are parsed as decimals.
    
    ## How was this patch tested?
    Created a new end-to-end test suite SQLTypeCoercionSuite. In the future we can move all other type checking tests here. I first tried adding a test in SQLQuerySuite but the suite was clearly already too large.
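
    As an illustration (my own example, not quoted from the JIRA), this is the kind of query the patch targets. Before this change, decimal literals of different precision/scale (`0.1` is decimal(1,1), `0.01` is decimal(2,2)) are not coerced to a common type, so queries like the following fail analysis with a data type mismatch:

    ```
    scala> sql("select map(0.1, 1, 0.01, 2)").collect()
    scala> sql("select struct(0.1, 0.01)").collect()
    ```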


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/petermaxlee/spark SPARK-16714

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14389.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14389
    
----
commit 9774605c6be7bc5f41267c37f66567a498ea9156
Author: petermaxlee <pe...@gmail.com>
Date:   2016-07-28T07:34:39Z

    [SPARK-16714][SQL] map, struct function should accept decimals with different precision/scale

----




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    @petermaxlee, after your patch, `findTightestCommonTypeOfTwo` is only used by `BinaryOperator` and `JSON schema inference`. I checked all the `BinaryOperator` implementations: except for ones like `And` and `BitwiseOr` that don't accept decimal types, all of them are handled in `DecimalPrecision`. This means no expression needs the semantic of `findTightestCommonTypeOfTwo`: finding the tightest common type of two types without precision loss. Can you check `JSON schema inference`? If it doesn't need this semantic either, I think we can safely remove it and only use `findWiderTypeForTwo`.
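
    As a rough illustration of the difference between the two semantics (assumed results for hypothetical inputs, not output of the actual implementation):

    ```
    // findTightestCommonTypeOfTwo widens only when nothing can be lost, and skips
    // decimal pairs (DecimalPrecision handles those for BinaryOperator):
    //   findTightestCommonTypeOfTwo(IntegerType, LongType)                 // Some(LongType)
    //   findTightestCommonTypeOfTwo(DecimalType(5, 2), DecimalType(10, 5)) // None
    // findWiderTypeForTwo also widens decimals, truncating if the result would
    // exceed the maximum precision:
    //   findWiderTypeForTwo(DecimalType(5, 2), DecimalType(10, 5))         // Some(DecimalType(10, 5))
    ```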




[GitHub] spark pull request #14389: [SPARK-16714][SQL] map, array function should acc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14389#discussion_r72579629
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -440,7 +440,7 @@ object TypeCoercion {
     
           case a @ CreateArray(children) if !haveSameType(children) =>
             val types = children.map(_.dataType)
    -        findTightestCommonTypeAndPromoteToString(types) match {
    +        findWiderCommonType(types) match {
    --- End diff --
    
    Does Hive allow precision loss in this case?




[GitHub] spark pull request #14389: [SPARK-16714][SQL] Refactor type widening for con...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14389#discussion_r72755309
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -99,45 +99,13 @@ object TypeCoercion {
         case _ => None
       }
     
    -  /** Similar to [[findTightestCommonType]], but can promote all the way to StringType. */
    -  def findTightestCommonTypeToString(left: DataType, right: DataType): Option[DataType] = {
    -    findTightestCommonTypeOfTwo(left, right).orElse((left, right) match {
    -      case (StringType, t2: AtomicType) if t2 != BinaryType && t2 != BooleanType => Some(StringType)
    -      case (t1: AtomicType, StringType) if t1 != BinaryType && t1 != BooleanType => Some(StringType)
    -      case _ => None
    -    })
    -  }
    -
    -  /**
    -   * Similar to [[findTightestCommonType]], if can not find the TightestCommonType, try to use
    -   * [[findTightestCommonTypeToString]] to find the TightestCommonType.
    -   */
    -  private def findTightestCommonTypeAndPromoteToString(types: Seq[DataType]): Option[DataType] = {
    --- End diff --
    
    this is no longer used




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    to refactor the type widening rules, i.e. we only need 2 rules: `findWiderType` and `findWiderTypeAndPromoteToString`.




[GitHub] spark issue #14389: [SPARK-16714][SQL] map, array function should accept dec...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    I was looking at the code, and I think this is a more general problem with decimal widening. The same problem exists for `least` and other functions.
    
    ```
    scala> sql("select least(0.1, 0.01)").collect()
    org.apache.spark.sql.AnalysisException: cannot resolve 'least(CAST(0.1 AS DECIMAL(1,1)), CAST(0.01 AS DECIMAL(2,2)))' due to data type mismatch: The expressions should all have the same type, got LEAST (ArrayBuffer(DecimalType(1,1), DecimalType(2,2))).; line 1 pos 7
    ```
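
    For what it's worth, under a widening rule that keeps all significant digits, decimal(1,1) and decimal(2,2) would widen to decimal(2,2), so `least(0.1, 0.01)` could resolve by casting `0.1` to `0.10`. A rough sketch of that arithmetic (my reading of the usual union rule, not quoted from the patch):

    ```
    // integer digits: max(1 - 1, 2 - 2) = 0
    // scale:          max(1, 2)         = 2
    // widened type:   decimal(0 + 2, 2) = decimal(2, 2)
    ```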




[GitHub] spark issue #14389: [SPARK-16714][SQL] map, struct function should accept de...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    This should resolve the following two pull requests as well:
    
    https://github.com/apache/spark/pull/14353
    https://github.com/apache/spark/pull/14374




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    **[Test build #63001 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63001/consoleFull)** for PR 14389 at commit [`071b01d`](https://github.com/apache/spark/commit/071b01d8a0cabefe3f016f7de6fa91e23821c514).




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    **[Test build #63007 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63007/consoleFull)** for PR 14389 at commit [`b9f94fe`](https://github.com/apache/spark/commit/b9f94febaf0abbee95caab1fa7d7eed4e6748382).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #14389: [SPARK-16714][SQL] map, struct function should accept de...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Could you add my test case here, too?




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    **[Test build #3196 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3196/consoleFull)** for PR 14389 at commit [`b9f94fe`](https://github.com/apache/spark/commit/b9f94febaf0abbee95caab1fa7d7eed4e6748382).




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    **[Test build #63007 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63007/consoleFull)** for PR 14389 at commit [`b9f94fe`](https://github.com/apache/spark/commit/b9f94febaf0abbee95caab1fa7d7eed4e6748382).




[GitHub] spark issue #14389: [SPARK-16714][SQL] map, array function should accept dec...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    FYI, I had a look before. It's map, array, greatest and least.




[GitHub] spark pull request #14389: [SPARK-16714][SQL] map, array function should acc...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14389#discussion_r72577751
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLTypeCoercionSuite.scala ---
    @@ -0,0 +1,44 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql
    +
    +import java.math.BigDecimal
    +
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +/**
    + * End-to-end tests for type coercion.
    + */
    +class SQLTypeCoercionSuite extends QueryTest with SharedSQLContext {
    +
    +  test("SPARK-16714 decimal in map and struct") {
    --- End diff --
    
    I made a mistake here with the naming. Will fix it later.




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    And it is consistent to handle `Nvl` and `Nvl2` the same way we handle other functions of a similar kind (e.g. `coalesce`).





[GitHub] spark issue #14389: [SPARK-16714][SQL] map, array function should accept dec...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    One problem with the decimal type in Spark SQL is that the wider type of 2 decimal types may be illegal (exceeding the system limit), in which case we have to truncate and suffer precision loss. This forces us to decide which functions can accept precision loss and which cannot.

    Unfortunately, this is not a common problem (e.g. MySQL and Postgres don't have it), so we don't have many similar systems to compare against and follow.

    MySQL's decimal type has a max scale that is half of the max precision, so the wider type of 2 decimal types in MySQL will never exceed the system limit.
    Postgres has a kind of unlimited decimal type, so it doesn't have this problem at all.

    I think MySQL's design is a good one to follow. cc @rxin @marmbrus @yhuai, what do you think?
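
    A concrete example of the limit being exceeded (illustrative only; the exact capping behaviour is my assumption, not quoted from Spark's code):

    ```
    // Two legal decimal types whose union needs more than the max precision of 38:
    //   decimal(38, 0)  union  decimal(38, 38)
    //   integer digits: max(38 - 0, 38 - 38) = 38
    //   scale:          max(0, 38)           = 38
    //   => would need decimal(76, 38), which has to be truncated back to a
    //      precision of 38, losing integer digits and/or scale.
    ```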




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    This depends on what we want for greatest/least. These two expressions do support string type as input.





[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Currently we have 3 different semantics for finding a wider type:

    1. `findTightestCommonType`: try to find a wider type, but skip decimal types
    2. `findTightestCommonTypeToString`: similar to rule 1, but if one side is a string, it can promote the other side to string
    3. `findWiderCommonType`: similar to rule 2, but also handles decimal types, truncating if necessary.

    It makes sense to have 2 different semantics for string promotion; however, I don't think we need 2 different semantics for decimal types. There is no function that needs its arguments to be the same type but can't accept precision loss for decimals. I think it's better to run the query and log a warning message about the truncation, rather than fail the query.
    
    We only need 2 semantics:
    
    1. `findWiderType`: try to find the wider type, including decimal type, and truncate if necessary
    2. `findWiderTypeAndPromoteToString`: similar to rule 1, but handles string promotion.
    
    We also need to add some checks before applying the type widening rules, to avoid conflicting with `DecimalPrecision`, which defines special rules for binary arithmetic on decimal types.
    
    @petermaxlee in your PR, the string promotion semantic is hidden inside `findWiderType`, which makes `greatest` and `least` accept string promotion; that is not expected. What do you think?
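
    A minimal sketch of what the two proposed rules could look like, modeled on the foldLeft pattern of the removed helpers shown in the diffs above. `widerTypeOfTwo` is a hypothetical pairwise helper (widening numeric and decimal types, truncating decimals if needed, with no string promotion), and the string-promotion cases are copied from the removed `findTightestCommonTypeToString`; this is not the actual patch:

    ```
    // Assumed to live in object TypeCoercion next to the existing helpers.
    private def findWiderType(types: Seq[DataType]): Option[DataType] = {
      // widen pairwise, allowing decimal truncation but no string promotion
      types.foldLeft[Option[DataType]](Some(NullType))((r, c) => r match {
        case None => None
        case Some(d) => widerTypeOfTwo(d, c)
      })
    }

    private def findWiderTypeAndPromoteToString(types: Seq[DataType]): Option[DataType] = {
      types.foldLeft[Option[DataType]](Some(NullType))((r, c) => r match {
        case None => None
        case Some(d) =>
          widerTypeOfTwo(d, c).orElse((d, c) match {
            // same string-promotion cases as the removed findTightestCommonTypeToString
            case (StringType, t2: AtomicType) if t2 != BinaryType && t2 != BooleanType => Some(StringType)
            case (t1: AtomicType, StringType) if t1 != BinaryType && t1 != BooleanType => Some(StringType)
            case _ => None
          })
      })
    }
    ```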




[GitHub] spark pull request #14389: [SPARK-16714][SQL] Refactor type widening for con...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14389#discussion_r72773724
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -99,45 +99,13 @@ object TypeCoercion {
         case _ => None
       }
     
    -  /** Similar to [[findTightestCommonType]], but can promote all the way to StringType. */
    -  def findTightestCommonTypeToString(left: DataType, right: DataType): Option[DataType] = {
    -    findTightestCommonTypeOfTwo(left, right).orElse((left, right) match {
    -      case (StringType, t2: AtomicType) if t2 != BinaryType && t2 != BooleanType => Some(StringType)
    -      case (t1: AtomicType, StringType) if t1 != BinaryType && t1 != BooleanType => Some(StringType)
    -      case _ => None
    -    })
    -  }
    -
    -  /**
    -   * Similar to [[findTightestCommonType]], if can not find the TightestCommonType, try to use
    -   * [[findTightestCommonTypeToString]] to find the TightestCommonType.
    -   */
    -  private def findTightestCommonTypeAndPromoteToString(types: Seq[DataType]): Option[DataType] = {
    -    types.foldLeft[Option[DataType]](Some(NullType))((r, c) => r match {
    -      case None => None
    -      case Some(d) =>
    -        findTightestCommonTypeToString(d, c)
    -    })
    -  }
    -
    -  /**
    -   * Find the tightest common type of a set of types by continuously applying
    -   * `findTightestCommonTypeOfTwo` on these types.
    -   */
    -  private def findTightestCommonType(types: Seq[DataType]): Option[DataType] = {
    -    types.foldLeft[Option[DataType]](Some(NullType))((r, c) => r match {
    -      case None => None
    -      case Some(d) => findTightestCommonTypeOfTwo(d, c)
    -    })
    -  }
    -
       /**
        * Case 2 type widening (see the classdoc comment above for TypeCoercion).
        *
        * i.e. the main difference with [[findTightestCommonTypeOfTwo]] is that here we allow some
    --- End diff --
    
    also mention the string promotion here?




[GitHub] spark issue #14389: [SPARK-16714][SQL] map, struct function should accept de...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    **[Test build #62958 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62958/consoleFull)** for PR 14389 at commit [`9774605`](https://github.com/apache/spark/commit/9774605c6be7bc5f41267c37f66567a498ea9156).




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    @HyukjinKwon FWIW I don't think it makes sense to handle everything except greatest/least. It also does not make sense to automatically coerce from decimal to double type for these two functions.

    The decimal truncation problem described in SPARK-16646 seems orthogonal to this. It would be better if we didn't need to worry about truncation (e.g. with an unlimited decimal type, or with precision always double the scale), but I don't think that should affect what type greatest/least use.




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    For `greatest(1, '2')`, Hive, MySQL and Postgres will turn the string into a double instead of promoting the integer to a string. This is the same behaviour as most of the arithmetic expressions, e.g. `Add`, `Minus`, `Divide`, etc. I think it makes sense and Spark SQL should follow it.
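
    For example (illustrative of the behaviour described, not actual output from any of these systems):

    ```
    scala> sql("select greatest(1, '2')").collect()
    // expected: the string '2' is cast to double, so the result is 2.0,
    // rather than the integer 1 being promoted to the string "1".
    ```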




[GitHub] spark pull request #14389: [SPARK-16714][SQL] Refactor type widening for con...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14389#discussion_r72881850
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -99,45 +99,13 @@ object TypeCoercion {
         case _ => None
       }
     
    -  /** Similar to [[findTightestCommonType]], but can promote all the way to StringType. */
    -  def findTightestCommonTypeToString(left: DataType, right: DataType): Option[DataType] = {
    -    findTightestCommonTypeOfTwo(left, right).orElse((left, right) match {
    -      case (StringType, t2: AtomicType) if t2 != BinaryType && t2 != BooleanType => Some(StringType)
    -      case (t1: AtomicType, StringType) if t1 != BinaryType && t1 != BooleanType => Some(StringType)
    -      case _ => None
    -    })
    -  }
    -
    -  /**
    -   * Similar to [[findTightestCommonType]], if can not find the TightestCommonType, try to use
    -   * [[findTightestCommonTypeToString]] to find the TightestCommonType.
    -   */
    -  private def findTightestCommonTypeAndPromoteToString(types: Seq[DataType]): Option[DataType] = {
    -    types.foldLeft[Option[DataType]](Some(NullType))((r, c) => r match {
    -      case None => None
    -      case Some(d) =>
    -        findTightestCommonTypeToString(d, c)
    -    })
    -  }
    -
    -  /**
    -   * Find the tightest common type of a set of types by continuously applying
    -   * `findTightestCommonTypeOfTwo` on these types.
    -   */
    -  private def findTightestCommonType(types: Seq[DataType]): Option[DataType] = {
    -    types.foldLeft[Option[DataType]](Some(NullType))((r, c) => r match {
    -      case None => None
    -      case Some(d) => findTightestCommonTypeOfTwo(d, c)
    -    })
    -  }
    -
       /**
        * Case 2 type widening (see the classdoc comment above for TypeCoercion).
        *
        * i.e. the main difference with [[findTightestCommonTypeOfTwo]] is that here we allow some
    --- End diff --
    
    done




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    **[Test build #63001 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63001/consoleFull)** for PR 14389 at commit [`071b01d`](https://github.com/apache/spark/commit/071b01d8a0cabefe3f016f7de6fa91e23821c514).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #14389: [SPARK-16714][SQL] Refactor type widening for con...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14389#discussion_r72755318
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -99,45 +99,13 @@ object TypeCoercion {
         case _ => None
       }
     
    -  /** Similar to [[findTightestCommonType]], but can promote all the way to StringType. */
    -  def findTightestCommonTypeToString(left: DataType, right: DataType): Option[DataType] = {
    -    findTightestCommonTypeOfTwo(left, right).orElse((left, right) match {
    -      case (StringType, t2: AtomicType) if t2 != BinaryType && t2 != BooleanType => Some(StringType)
    -      case (t1: AtomicType, StringType) if t1 != BinaryType && t1 != BooleanType => Some(StringType)
    -      case _ => None
    -    })
    -  }
    -
    -  /**
    -   * Similar to [[findTightestCommonType]], if can not find the TightestCommonType, try to use
    -   * [[findTightestCommonTypeToString]] to find the TightestCommonType.
    -   */
    -  private def findTightestCommonTypeAndPromoteToString(types: Seq[DataType]): Option[DataType] = {
    -    types.foldLeft[Option[DataType]](Some(NullType))((r, c) => r match {
    -      case None => None
    -      case Some(d) =>
    -        findTightestCommonTypeToString(d, c)
    -    })
    -  }
    -
    -  /**
    -   * Find the tightest common type of a set of types by continuously applying
    -   * `findTightestCommonTypeOfTwo` on these types.
    -   */
    -  private def findTightestCommonType(types: Seq[DataType]): Option[DataType] = {
    --- End diff --
    
    this is no longer used.




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    **[Test build #63034 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63034/consoleFull)** for PR 14389 at commit [`ffd1734`](https://github.com/apache/spark/commit/ffd17346f48715aad09280801bffefd1eb30256b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #14389: [SPARK-16714][SQL] map, array function should accept dec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #14389: [SPARK-16714][SQL] map, array function should acc...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14389#discussion_r72580646
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -440,7 +440,7 @@ object TypeCoercion {
     
           case a @ CreateArray(children) if !haveSameType(children) =>
             val types = children.map(_.dataType)
    -        findTightestCommonTypeAndPromoteToString(types) match {
    +        findWiderCommonType(types) match {
    --- End diff --
    
    Oh wait! Sorry, I will correct the example soon.




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    I don't believe the remaining use cases of `findTightestCommonTypeOfTwo` are necessary. That said, getting rid of those use cases would require more refactoring. I'm assuming we want to fix this issue in 2.0.1, so perhaps it is best to do the refactoring separately, only for the master branch.





[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63001/
    Test FAILed.




[GitHub] spark issue #14389: [SPARK-16714][SQL] map, array function should accept dec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    @petermaxlee. Yep, I deleted my request, but you should have a test case with real columns on table data. :)




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Which part do you want me to update? I thought you'd already committed the changes needed. Let me know and I will update this.





[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    I ran the Python tests locally and they passed. It looks like the failure was caused by the Jenkins shutdown.





[GitHub] spark issue #14389: [SPARK-16714][SQL] map, array function should accept dec...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    cc @cloud-fan and @liancheng 




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62999/
    Test FAILed.




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    **[Test build #63034 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63034/consoleFull)** for PR 14389 at commit [`ffd1734`](https://github.com/apache/spark/commit/ffd17346f48715aad09280801bffefd1eb30256b).




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Have you read this comment?
    
    >We are discussing this internally, can you hold it for a while? We may decide to increase the max precision to 76 and keep max scale as 38, then we don't have this problem.




[GitHub] spark issue #14389: [SPARK-16714][SQL] map, array function should accept dec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    **[Test build #62958 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62958/consoleFull)** for PR 14389 at commit [`9774605`](https://github.com/apache/spark/commit/9774605c6be7bc5f41267c37f66567a498ea9156).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #14389: [SPARK-16714][SQL] Make function type coercion more cons...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    **[Test build #63000 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63000/consoleFull)** for PR 14389 at commit [`929c39f`](https://github.com/apache/spark/commit/929c39ffae1a860c39216041cc994d6e577cc6e5).




[GitHub] spark issue #14389: [SPARK-16714][SQL] map, array function should accept dec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62958/
    Test PASSed.




[GitHub] spark issue #14389: [SPARK-16714][SQL] map, array function should accept dec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    **[Test build #62999 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62999/consoleFull)** for PR 14389 at commit [`afca003`](https://github.com/apache/spark/commit/afca0032e5f1365fdd6face07cedec44bb000e25).




[GitHub] spark pull request #14389: [SPARK-16714][SQL] Refactor type widening for con...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14389#discussion_r72755300
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -99,45 +99,13 @@ object TypeCoercion {
         case _ => None
       }
     
    -  /** Similar to [[findTightestCommonType]], but can promote all the way to StringType. */
    -  def findTightestCommonTypeToString(left: DataType, right: DataType): Option[DataType] = {
    --- End diff --
    
    I inlined this into `findWiderTypeForTwo`.
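
    A rough sketch of what that inlining could look like (my guess at the shape; `widerDecimalType` is a hypothetical helper for the decimal union rule, and this is not the actual diff):

    ```
    def findWiderTypeForTwo(left: DataType, right: DataType): Option[DataType] = {
      findTightestCommonTypeOfTwo(left, right).orElse((left, right) match {
        // widen decimals, truncating if the result exceeds the max precision
        case (d1: DecimalType, d2: DecimalType) => Some(widerDecimalType(d1, d2))
        // string promotion, folded in from the removed findTightestCommonTypeToString
        case (StringType, t2: AtomicType) if t2 != BinaryType && t2 != BooleanType => Some(StringType)
        case (t1: AtomicType, StringType) if t1 != BinaryType && t1 != BooleanType => Some(StringType)
        case _ => None
      })
    }
    ```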




[GitHub] spark issue #14389: [SPARK-16714][SQL] map, array function should accept dec...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    @dongjoon-hyun You only had one test case, didn't you? I don't think that test case is useful, since it was specifically testing `checkInputDataTypes`, which is not the right thing to test. Type coercion should be handled by the analyzer, not by the expression's type checking.




[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Hi @petermaxlee, are you going to update this?




[GitHub] spark pull request #14389: [SPARK-16714][SQL] map, array function should acc...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14389#discussion_r72580777
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -440,7 +440,7 @@ object TypeCoercion {
     
           case a @ CreateArray(children) if !haveSameType(children) =>
             val types = children.map(_.dataType)
    -        findTightestCommonTypeAndPromoteToString(types) match {
    +        findWiderCommonType(types) match {
    --- End diff --
    
    I fixed the example. It seems the precision is being truncated.




[GitHub] spark issue #14389: [SPARK-16714][SQL] Make function type coercion more cons...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    I've pushed a new change to make it consistent across all the instances that I could find. I believe the new version has more consistent behavior across functions and is simpler to understand.


[GitHub] spark pull request #14389: [SPARK-16714][SQL] Refactor type widening for con...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee closed the pull request at:

    https://github.com/apache/spark/pull/14389


[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63034/
    Test PASSed.


[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Merged build finished. Test FAILed.


[GitHub] spark issue #14389: [SPARK-16714][SQL] Make function type coercion more cons...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    @petermaxlee At least for `least` and `greatest`, please refer to https://issues.apache.org/jira/browse/SPARK-16646. It seems we should hold off on those.


[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63000/
    Test FAILed.


[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    **[Test build #3196 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3196/consoleFull)** for PR 14389 at commit [`b9f94fe`](https://github.com/apache/spark/commit/b9f94febaf0abbee95caab1fa7d7eed4e6748382).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    And can you make sure that it's safe to use `findWiderTypeForTwo` for `Nvl` and `Nvl2`?
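    
    For a concrete sense of what "safe" means here, a sketch of my own (run in spark-shell, where the `spark` session is predefined; the literals are illustrative only):
    
    ```
    // decimal(38, 0) and decimal(38, 38) cannot both fit exactly in any decimal with
    // precision <= 38, so a "wider" rule has to give up exactness for this pair, either by
    // capping the decimal precision or by falling back to double (I have not checked which
    // path nvl takes). A strictly lossless rule would instead reject the query at analysis time.
    spark.sql("SELECT nvl(CAST(1 AS DECIMAL(38, 0)), CAST(0.5 AS DECIMAL(38, 38)))").show(truncate = false)
    ```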


[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    OK that makes sense!



[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Sorry that it has taken this long. I have submitted a work-in-progress pull request at https://github.com/apache/spark/pull/14696
    
    Going to close this one and continue the work there, since it is a fairly different pull request.



[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    **[Test build #62999 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62999/consoleFull)** for PR 14389 at commit [`afca003`](https://github.com/apache/spark/commit/afca0032e5f1365fdd6face07cedec44bb000e25).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63007/
    Test FAILed.


[GitHub] spark issue #14389: [SPARK-16714][SQL] map, array function should accept dec...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Yea, for least and greatest, I opened https://github.com/apache/spark/pull/14294. Actually, I am worried about whether allowing loss of precision and of fractional parts is okay.
    
    I first thought this should only allow widening within a range where no values are lost, but it seems some think the values should simply be truncated, and Hive does this by always falling back to double.
    
    Please refer to https://issues.apache.org/jira/browse/SPARK-16646.
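    
    A small illustration of the two options being debated (mine, not from either PR; run in spark-shell where the `spark` session is predefined):
    
    ```
    // Under a strictly lossless rule, mixing a high-precision decimal literal with a double
    // argument could simply be rejected, because no type holds both exactly. Under the
    // Hive-style rule everything is widened to double and the query succeeds, silently
    // dropping digits a double cannot hold.
    spark.sql("SELECT greatest(1.0000000000000005, CAST(1 AS DOUBLE))").show(truncate = false)
    ```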


[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    **[Test build #63000 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63000/consoleFull)** for PR 14389 at commit [`929c39f`](https://github.com/apache/spark/commit/929c39ffae1a860c39216041cc994d6e577cc6e5).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #14389: [SPARK-16714][SQL] map, array function should accept dec...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    I examined the usage of the various type coercion rules; here is how each one is used (a hedged sketch contrasting the two families follows the list):
    
    ```
    
    findTightestCommonTypeOfTwo
    
    - BinaryOperator
    - IfNull
    - NullIf
    - Nvl2
    - JSON schema inference
    
    findTightestCommonTypeToString
    
    - Nvl
    
    findTightestCommonTypeAndPromoteToString
    
    - CreateArray
    - CreateMap
    
    findTightestCommonType
    
    - Greatest
    - Least
    
    findWiderTypeForTwo
    
    - IfCoercion
    
    findWiderCommonType
    
    - WidenSetOperationTypes
    - InConversion
    - Coalesce
    - CaseWhenCoercion
    ```
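    
    To recap the split behind this grouping (my own summary): the `findTightest*` family only returns a type into which every input fits exactly, while the `findWider*` family may additionally fall back to a lossy widening. A hedged sketch of the observable difference, assuming the literals below parse as decimal(1,1) and decimal(2,2), run in spark-shell:
    
    ```
    // CreateMap currently goes through findTightestCommonTypeAndPromoteToString, which (as
    // far as I can tell) has no case for two distinct DecimalTypes, so this fails analysis:
    spark.sql("SELECT map(0.1, 'a', 0.01, 'b')")
    
    // CaseWhenCoercion goes through findWiderCommonType, so the same pair of decimals is
    // accepted and widened to a common type here:
    spark.sql("SELECT CASE WHEN true THEN 0.1 ELSE 0.01 END").show(truncate = false)
    ```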


[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Merged build finished. Test FAILed.


[GitHub] spark issue #14389: [SPARK-16714][SQL] map, array function should accept dec...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    If we want to just fix these bugs, I think we should come up with a list of which functions (those that need arguments of the same type) can accept precision loss and which cannot.


[GitHub] spark issue #14389: [SPARK-16714][SQL] map, struct function should accept de...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Also, please add `array` to the title and description.


[GitHub] spark pull request #14389: [SPARK-16714][SQL] map, array function should acc...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14389#discussion_r72580413
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -440,7 +440,7 @@ object TypeCoercion {
     
           case a @ CreateArray(children) if !haveSameType(children) =>
             val types = children.map(_.dataType)
    -        findTightestCommonTypeAndPromoteToString(types) match {
    +        findWiderCommonType(types) match {
    --- End diff --
    
    In current master, yes, it seems so:
    
    ```
    hive> SELECT array(1, 1.5, 1.0000000000000005);
    OK
    [1.0,1.5,1.0000000000000004]
    Time taken: 0.063 seconds, Fetched: 1 row(s)
    ```


[GitHub] spark issue #14389: [SPARK-16714][SQL] Refactor type widening for consistenc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14389
  
    Merged build finished. Test PASSed.

