You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by petermaxlee <gi...@git.apache.org> on 2016/07/26 07:39:31 UTC

[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

GitHub user petermaxlee opened a pull request:

    https://github.com/apache/spark/pull/14364

    [SPARK-16730][SQL] Implement function aliases for type casts

    ## What changes were proposed in this pull request?
    Spark 1.x supports using the Hive type name as function names for doing casts, e.g.
    ```sql
    SELECT int(1.0);
    SELECT string(2.0);
    ```
    
    The above query would work in Spark 1.x because Spark 1.x fail back to Hive for unimplemented functions, and break in Spark 2.0 because the fall back was removed.
    
    This patch implements function aliases using an analyzer rule for the following cast functions:
    - boolean
    - tinyint
    - smallint
    - int
    - bigint
    - float
    - double
    - decimal
    - date
    - timestamp
    - binary
    - string
    
    ## How was this patch tested?
    Added end-to-end tests in SQLCompatibilityFunctionSuite.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/petermaxlee/spark SPARK-16730-2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14364.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14364
    
----

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    **[Test build #62902 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62902/consoleFull)** for PR 14364 at commit [`b8fbcab`](https://github.com/apache/spark/commit/b8fbcab1d5bf78f10b0edce2a1011080a38f4fc6).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62872/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14364#discussion_r72379526
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -408,8 +409,21 @@ object FunctionRegistry {
         expression[BitwiseAnd]("&"),
         expression[BitwiseNot]("~"),
         expression[BitwiseOr]("|"),
    -    expression[BitwiseXor]("^")
    -
    +    expression[BitwiseXor]("^"),
    +
    +    // Cast aliases (SPARK-16730)
    +    castAlias("boolean", BooleanType),
    +    castAlias("tinyint", ByteType),
    +    castAlias("smallint", ShortType),
    +    castAlias("int", IntegerType),
    +    castAlias("bigint", LongType),
    --- End diff --
    
    ok agree


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14364#discussion_r72563566
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -452,14 +466,37 @@ object FunctionRegistry {
           }
         }
     
    -    val clazz = tag.runtimeClass
    +    (name, (expressionInfo[T](name), builder))
    +  }
    +
    +  /**
    +   * Creates a function registry lookup entry for cast aliases (SPARK-16730).
    +   * For example, if name is "int", and dataType is IntegerType, this means int(x) would become
    +   * an alias for cast(x as IntegerType).
    +   * See usage above.
    +   */
    +  private def castAlias(
    +      name: String,
    +      dataType: DataType): (String, (ExpressionInfo, FunctionBuilder)) = {
    +    val builder = (args: Seq[Expression]) => {
    +      if (args.size != 1) {
    +        throw new AnalysisException(s"Function $name accepts only one argument")
    +      }
    +      Cast(args.head, dataType)
    +    }
    +    (name, (expressionInfo[Cast](name), builder))
    --- End diff --
    
    so whatever cast function we describe, we will always show `Cast`'s description right? Is it same with hive?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    mostly LGTM, thanks for working on it!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62928/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    A few minor comments. LGTM otherwise.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14364#discussion_r72379046
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -408,8 +409,21 @@ object FunctionRegistry {
         expression[BitwiseAnd]("&"),
         expression[BitwiseNot]("~"),
         expression[BitwiseOr]("|"),
    -    expression[BitwiseXor]("^")
    -
    +    expression[BitwiseXor]("^"),
    +
    +    // Cast aliases (SPARK-16730)
    +    castAlias("boolean", BooleanType),
    +    castAlias("tinyint", ByteType),
    +    castAlias("smallint", ShortType),
    +    castAlias("int", IntegerType),
    +    castAlias("bigint", LongType),
    --- End diff --
    
    use `LongType.simpleString` instead of `bigint` looks better. Same to others.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee closed the pull request at:

    https://github.com/apache/spark/pull/14364


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    **[Test build #62928 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62928/consoleFull)** for PR 14364 at commit [`5bbe995`](https://github.com/apache/spark/commit/5bbe995fba0a3355b0edc795bd7e9188e6227188).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14364#discussion_r72379516
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLCompatibilityFunctionSuite.scala ---
    @@ -69,4 +73,25 @@ class SQLCompatibilityFunctionSuite extends QueryTest with SharedSQLContext {
           sql("SELECT nvl2(null, 1, 2.1d), nvl2('n', 1, 2.1d)"),
           Row(2.1, 1.0))
       }
    +
    +  test("SPARK-16730 cast alias functions for Hive compatibility") {
    +    checkAnswer(
    +      sql("SELECT boolean(1), tinyint(1), smallint(1), int(1), bigint(1)"),
    +      Row(true, 1.toByte, 1.toShort, 1, 1L))
    +
    +    checkAnswer(
    +      sql("SELECT float(1), double(1), decimal(1)"),
    +      Row(1.toFloat, 1.0, new BigDecimal(1)))
    +
    +    checkAnswer(
    +      sql("SELECT date(\"2014-04-04\"), timestamp(date(\"2014-04-04\"))"),
    +      Row(new java.util.Date(114, 3, 4), new Timestamp(114, 3, 4, 0, 0, 0, 0)))
    +
    +    checkAnswer(
    +      sql("SELECT string(1)"),
    +      Row("1"))
    +
    +    // Error handling: only one argument
    +    assert(intercept[AnalysisException](sql("SELECT string(1, 2)")).getMessage.contains("one arg"))
    --- End diff --
    
    how about we use the full error message here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62902/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14364#discussion_r72204545
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -408,8 +409,21 @@ object FunctionRegistry {
         expression[BitwiseAnd]("&"),
         expression[BitwiseNot]("~"),
         expression[BitwiseOr]("|"),
    -    expression[BitwiseXor]("^")
    -
    +    expression[BitwiseXor]("^"),
    +
    +    // Cast aliases (SPARK-16730)
    +    castAlias("boolean", BooleanType),
    --- End diff --
    
    boolean is just a normal function in Hive (same as for example acos), so it would do whatever a normal function's behavior is.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    **[Test build #62872 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62872/consoleFull)** for PR 14364 at commit [`8b087b2`](https://github.com/apache/spark/commit/8b087b2960071ff4c273b8d01c694fbfd4fac199).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14364#discussion_r72391683
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -452,14 +466,36 @@ object FunctionRegistry {
           }
         }
     
    -    val clazz = tag.runtimeClass
    +    (name, (expressionInfo[T](name), builder))
    +  }
    +
    +  /**
    +   * Creates a function lookup registry for cast aliases (SPARK-16730).
    +   * For example, if name is "int", and dataType is IntegerType, this means int(x) would become
    +   * an alias for cast(x as IntegerType).
    +   * See usage above.
    +   */
    +  private def castAlias(name: String, dataType: DataType)
    --- End diff --
    
    NIT style: put every argument on a different line, e.g.:
    ```scala
    private def castAlias(
        name: String,
        dataType: DataType): (String, (ExpressionInfo, FunctionBuilder)) = { ...
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14364#discussion_r72380101
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLCompatibilityFunctionSuite.scala ---
    @@ -69,4 +73,25 @@ class SQLCompatibilityFunctionSuite extends QueryTest with SharedSQLContext {
           sql("SELECT nvl2(null, 1, 2.1d), nvl2('n', 1, 2.1d)"),
           Row(2.1, 1.0))
       }
    +
    +  test("SPARK-16730 cast alias functions for Hive compatibility") {
    +    checkAnswer(
    +      sql("SELECT boolean(1), tinyint(1), smallint(1), int(1), bigint(1)"),
    +      Row(true, 1.toByte, 1.toShort, 1, 1L))
    +
    +    checkAnswer(
    +      sql("SELECT float(1), double(1), decimal(1)"),
    +      Row(1.toFloat, 1.0, new BigDecimal(1)))
    +
    +    checkAnswer(
    +      sql("SELECT date(\"2014-04-04\"), timestamp(date(\"2014-04-04\"))"),
    +      Row(new java.util.Date(114, 3, 4), new Timestamp(114, 3, 4, 0, 0, 0, 0)))
    +
    +    checkAnswer(
    +      sql("SELECT string(1)"),
    +      Row("1"))
    +
    +    // Error handling: only one argument
    +    assert(intercept[AnalysisException](sql("SELECT string(1, 2)")).getMessage.contains("one arg"))
    --- End diff --
    
    fixed


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14364#discussion_r72565468
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -452,14 +466,37 @@ object FunctionRegistry {
           }
         }
     
    -    val clazz = tag.runtimeClass
    +    (name, (expressionInfo[T](name), builder))
    +  }
    +
    +  /**
    +   * Creates a function registry lookup entry for cast aliases (SPARK-16730).
    +   * For example, if name is "int", and dataType is IntegerType, this means int(x) would become
    +   * an alias for cast(x as IntegerType).
    +   * See usage above.
    +   */
    +  private def castAlias(
    +      name: String,
    +      dataType: DataType): (String, (ExpressionInfo, FunctionBuilder)) = {
    +    val builder = (args: Seq[Expression]) => {
    +      if (args.size != 1) {
    +        throw new AnalysisException(s"Function $name accepts only one argument")
    +      }
    +      Cast(args.head, dataType)
    +    }
    +    (name, (expressionInfo[Cast](name), builder))
    --- End diff --
    
    Yes - this is a limitation. That's not what Hive does because Hive actually does not have a single cast expression. It has a cast expression for each target type. I think it's a pretty small detail and fixing it would require a lot of work.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62909/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14364#discussion_r72379517
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -408,8 +409,21 @@ object FunctionRegistry {
         expression[BitwiseAnd]("&"),
         expression[BitwiseNot]("~"),
         expression[BitwiseOr]("|"),
    -    expression[BitwiseXor]("^")
    -
    +    expression[BitwiseXor]("^"),
    +
    +    // Cast aliases (SPARK-16730)
    +    castAlias("boolean", BooleanType),
    +    castAlias("tinyint", ByteType),
    +    castAlias("smallint", ShortType),
    +    castAlias("int", IntegerType),
    +    castAlias("bigint", LongType),
    +    castAlias("float", FloatType),
    +    castAlias("double", DoubleType),
    +    castAlias("decimal", DecimalType.USER_DEFAULT),
    --- End diff --
    
    This is not what Hive's default does, but what Spark SQL's cast default.
    
    I think it is a bug, but I'm not sure if it is intentional. I suggest we change this in a separate pull request, since there is more than one place to check.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    **[Test build #62909 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62909/consoleFull)** for PR 14364 at commit [`3b78da3`](https://github.com/apache/spark/commit/3b78da343c06b7f1df2a67136cda99b4b74bc0f7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    Great. Thanks for merging.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14364#discussion_r72202307
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -408,8 +409,21 @@ object FunctionRegistry {
         expression[BitwiseAnd]("&"),
         expression[BitwiseNot]("~"),
         expression[BitwiseOr]("|"),
    -    expression[BitwiseXor]("^")
    -
    +    expression[BitwiseXor]("^"),
    +
    +    // Cast aliases (SPARK-16730)
    +    castAlias("boolean", BooleanType),
    --- End diff --
    
    in hive, if users create a udf called `boolean`, will hive throw exception or override the type casting one?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    **[Test build #62902 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62902/consoleFull)** for PR 14364 at commit [`b8fbcab`](https://github.com/apache/spark/commit/b8fbcab1d5bf78f10b0edce2a1011080a38f4fc6).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14364#discussion_r72379420
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -408,8 +409,21 @@ object FunctionRegistry {
         expression[BitwiseAnd]("&"),
         expression[BitwiseNot]("~"),
         expression[BitwiseOr]("|"),
    -    expression[BitwiseXor]("^")
    -
    +    expression[BitwiseXor]("^"),
    +
    +    // Cast aliases (SPARK-16730)
    +    castAlias("boolean", BooleanType),
    +    castAlias("tinyint", ByteType),
    +    castAlias("smallint", ShortType),
    +    castAlias("int", IntegerType),
    +    castAlias("bigint", LongType),
    +    castAlias("float", FloatType),
    +    castAlias("double", DoubleType),
    +    castAlias("decimal", DecimalType.USER_DEFAULT),
    --- End diff --
    
    can you double check it with hive? what's the default decimal type in hive?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    thanks, merging to master and 2.0!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    **[Test build #62909 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62909/consoleFull)** for PR 14364 at commit [`3b78da3`](https://github.com/apache/spark/commit/3b78da343c06b7f1df2a67136cda99b4b74bc0f7).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14364#discussion_r72391502
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -452,14 +466,36 @@ object FunctionRegistry {
           }
         }
     
    -    val clazz = tag.runtimeClass
    +    (name, (expressionInfo[T](name), builder))
    +  }
    +
    +  /**
    +   * Creates a function lookup registry for cast aliases (SPARK-16730).
    --- End diff --
    
    NIT: `...function lookup registry...` should that ben`...function registry lookup entry ...` or something similar?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    **[Test build #62872 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62872/consoleFull)** for PR 14364 at commit [`8b087b2`](https://github.com/apache/spark/commit/8b087b2960071ff4c273b8d01c694fbfd4fac199).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    I've updated the pull request based on @hvanhovell's comment.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    @cloud-fan this is an alternative implementation using FunctionRegistry.
    
    Let me know if you prefer this one over #14362.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by petermaxlee <gi...@git.apache.org>.
Github user petermaxlee commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14364#discussion_r72379213
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -408,8 +409,21 @@ object FunctionRegistry {
         expression[BitwiseAnd]("&"),
         expression[BitwiseNot]("~"),
         expression[BitwiseOr]("|"),
    -    expression[BitwiseXor]("^")
    -
    +    expression[BitwiseXor]("^"),
    +
    +    // Cast aliases (SPARK-16730)
    +    castAlias("boolean", BooleanType),
    +    castAlias("tinyint", ByteType),
    +    castAlias("smallint", ShortType),
    +    castAlias("int", IntegerType),
    +    castAlias("bigint", LongType),
    --- End diff --
    
    I think that's actually worse, because it makes it less clear what the function name is by looking at this source file. Also if for some reason we change LongType.simpleString in the future, these functions will subtly break.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14364#discussion_r72391788
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -408,8 +409,21 @@ object FunctionRegistry {
         expression[BitwiseAnd]("&"),
         expression[BitwiseNot]("~"),
         expression[BitwiseOr]("|"),
    -    expression[BitwiseXor]("^")
    -
    +    expression[BitwiseXor]("^"),
    +
    +    // Cast aliases (SPARK-16730)
    +    castAlias("boolean", BooleanType),
    +    castAlias("tinyint", ByteType),
    +    castAlias("smallint", ShortType),
    +    castAlias("int", IntegerType),
    +    castAlias("bigint", LongType),
    +    castAlias("float", FloatType),
    +    castAlias("double", DoubleType),
    +    castAlias("decimal", DecimalType.USER_DEFAULT),
    --- End diff --
    
    What does Hive do?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14364
  
    **[Test build #62928 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62928/consoleFull)** for PR 14364 at commit [`5bbe995`](https://github.com/apache/spark/commit/5bbe995fba0a3355b0edc795bd7e9188e6227188).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org