You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by adrian-wang <gi...@git.apache.org> on 2014/09/10 14:02:18 UTC

[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

GitHub user adrian-wang opened a pull request:

    https://github.com/apache/spark/pull/2344

    [SPARK-3407][SQL]Add Date type support

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/adrian-wang/spark date

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2344.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2344
    
----
commit 8edb30dfa0ffbdc18f87ce042badd8e52fe7c625
Author: Daoyuan Wang <da...@intel.com>
Date:   2014-09-10T11:42:00Z

    support date type

commit ce576b3730fb5942f78ebc7f3ee6472777d75fbd
Author: Daoyuan Wang <da...@intel.com>
Date:   2014-09-10T12:00:01Z

    set test list

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/2344


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58659730
  
    retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17711980
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -221,19 +221,35 @@ trait HiveTypeCoercion {
             a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
     
           case p: BinaryPredicate if p.left.dataType == StringType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(Cast(p.left, DateType), p.right))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == StringType =>
    +        p.makeCopy(Array(p.left, Cast(p.right, DateType)))
    +      case p: BinaryPredicate if p.left.dataType == StringType
             && p.right.dataType == TimestampType =>
             p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
           case p: BinaryPredicate if p.left.dataType == TimestampType
             && p.right.dataType == StringType =>
             p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
    +      case p: BinaryPredicate if p.left.dataType == TimestampType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(Cast(p.left, DateType), p.right))
    --- End diff --
    
    To get the same result from hive, seems we have to change everything to string here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58646718
  
    I can see most of the reviews are focused on comparing and ordering. I'd like to fix those comparing rules in a separated PR. I tested what you declared here the other day, that was also different from what Hive did. Also Python API is can be a separated PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58656123
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21592/Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r18695986
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -220,20 +220,44 @@ trait HiveTypeCoercion {
           case a: BinaryArithmetic if a.right.dataType == StringType =>
             a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
     
    +      // we should cast all timestamp/date/string compare into string compare
    +      case p: BinaryPredicate if p.left.dataType == StringType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == StringType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), p.right))
           case p: BinaryPredicate if p.left.dataType == StringType
             && p.right.dataType == TimestampType =>
    -        p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
    +        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
           case p: BinaryPredicate if p.left.dataType == TimestampType
             && p.right.dataType == StringType =>
    -        p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
    +        p.makeCopy(Array(Cast(p.left, StringType), p.right))
    +      case p: BinaryPredicate if p.left.dataType == TimestampType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == TimestampType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
    --- End diff --
    
    How about turning `Date`/`Timestamp` comparison to `Long` comparison? String and long representations of `Timestamp` are both accurate to seconds.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55889454
  
    Just rebase code, the failure is in spark-core and not in this patch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL][WIP]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55365864
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20203/consoleFull) for   PR 2344 at commit [`7ed5450`](https://github.com/apache/spark/commit/7ed545057408c3a6d6e4acb2a6d3bb7692010d73).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56619771
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20737/consoleFull) for   PR 2344 at commit [`be6a958`](https://github.com/apache/spark/commit/be6a958284932b0685656869b6586f1159cdb599).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58950576
  
    Thanks for working on this huge feature!  Merged to master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17710004
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -95,6 +100,8 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression {
           buildCast[Short](_, s => new Timestamp(s * 1000))
         case ByteType =>
           buildCast[Byte](_, b => new Timestamp(b * 1000))
    +    case DateType =>
    +      buildCast[Date](_, d => Timestamp.valueOf(dateToString(d) + " 00:00:00"))
    --- End diff --
    
    If the date `d` is polluted, for example `d.setTime(d.getTime + 1)`, this would be in a turmoil. I'm not sure if there's any way to get into that. Maybe before that really occurs, we can just use `new Timestamp(d.getTime)`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58736899
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21615/Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-57081729
  
    **[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20933/consoleFull)** after     a configured wait of `120m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56329240
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20642/consoleFull) for   PR 2344 at commit [`92b31b4`](https://github.com/apache/spark/commit/92b31b4ef9f86ff05c0d0429eb382e27da1fe18e).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `abstract class LogicalPlan extends QueryPlan[LogicalPlan] with Logging `
      * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55123057
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20103/consoleFull) for   PR 2344 at commit [`ce576b3`](https://github.com/apache/spark/commit/ce576b3730fb5942f78ebc7f3ee6472777d75fbd).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-57079061
  
    Hive is really strange dealing with date, there's still some inconsistency between, but I think it is better to update in following PRs.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL][WIP]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55363236
  
    @marmbrus Cheng Hao's PR #2368 will resolve the second issue :).
    After #2368 and #2355 are merged, I'll rebase my code and add `date_udf` to white list.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55110094
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20103/consoleFull) for   PR 2344 at commit [`ce576b3`](https://github.com/apache/spark/commit/ce576b3730fb5942f78ebc7f3ee6472777d75fbd).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58651729
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21591/consoleFull) for   PR 2344 at commit [`00fe81f`](https://github.com/apache/spark/commit/00fe81f5586118694642422f94846f48e6ca50fc).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58604586
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/329/consoleFull) for   PR 2344 at commit [`0c339b0`](https://github.com/apache/spark/commit/0c339b0b4b5572327ff1d61d9c441dabde6d1b45).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r18697748
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -56,7 +60,9 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression {
         case StringType =>
           buildCast[String](_, _.length() != 0)
         case TimestampType =>
    -      buildCast[Timestamp](_, b => b.getTime() != 0 || b.getNanos() != 0)
    +      buildCast[Timestamp](_, t => t.getTime() != 0 || t.getNanos() != 0)
    +    case DateType =>
    +      buildCast[Date](_, d => null)
    --- End diff --
    
    Leaving a comment here would be good. It's really unintuitive here to see a timestamp can be casted to a boolean while a date has to be null.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17765733
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala ---
    @@ -170,6 +170,18 @@ case object TimestampType extends NativeType {
       def simpleString: String = "timestamp"
     }
     
    +case object DateType extends NativeType {
    +  private[sql] type JvmType = Date
    +
    +  @transient private[sql] lazy val tag = ScalaReflectionLock.synchronized { typeTag[JvmType] }
    +
    +  private[sql] val ordering = new Ordering[JvmType] {
    +    def compare(x: Date, y: Date) = x.compareTo(y)
    --- End diff --
    
    I've checked the logic of `java.sql.Date`.`compareTo`, and it is not the same as `DateWritable`.`compareTo`, which is the internal representation in Hive. The former will compare its milliseconds, but the later only compare the `days since Epoch`, probably we need to follow the same semantic with Hive here.
    BTW: If we change the logic here, does that also mean we needn't cast the `date` to `string` for `BinaryPredicate` in `HiveTypeCoercion`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r18739665
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -220,20 +220,44 @@ trait HiveTypeCoercion {
           case a: BinaryArithmetic if a.right.dataType == StringType =>
             a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
     
    +      // we should cast all timestamp/date/string compare into string compare
    +      case p: BinaryPredicate if p.left.dataType == StringType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == StringType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), p.right))
           case p: BinaryPredicate if p.left.dataType == StringType
             && p.right.dataType == TimestampType =>
    -        p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
    +        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
           case p: BinaryPredicate if p.left.dataType == TimestampType
             && p.right.dataType == StringType =>
    -        p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
    +        p.makeCopy(Array(Cast(p.left, StringType), p.right))
    +      case p: BinaryPredicate if p.left.dataType == TimestampType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == TimestampType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
    --- End diff --
    
    So Michael agreed to leave the whole ordering and comparing stuff in a separated PR :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55968540
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20495/consoleFull) for   PR 2344 at commit [`413f946`](https://github.com/apache/spark/commit/413f946ea2d9488b08186307bed550fa298a8a23).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58655457
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21591/consoleFull) for   PR 2344 at commit [`00fe81f`](https://github.com/apache/spark/commit/00fe81f5586118694642422f94846f48e6ca50fc).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56619774
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20737/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58735889
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21615/consoleFull) for   PR 2344 at commit [`f15074a`](https://github.com/apache/spark/commit/f15074a614281d3fe4de4f0529ddc53994b4c0d9).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17710025
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -140,6 +147,39 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression {
         }
       }
     
    +  // Converts Timestamp to string according to Hive TimestampWritable convention
    +  private[this] def timestampToDateString(ts: Timestamp): String = {
    +    Cast.threadLocalDateFormat.get.format(ts)
    +  }
    +
    +  // DateConverter
    +  private[this] def castToDate: Any => Any = child.dataType match {
    +    case StringType =>
    +      buildCast[String](_, s => if (s.contains(" ")) {
    +        try castToDate(castToTimestamp(s))
    +        catch { case _: java.lang.IllegalArgumentException => null }
    +      } else {
    +        try Date.valueOf(s) catch { case _: java.lang.IllegalArgumentException => null }
    +      })
    +    case TimestampType =>
    +      buildCast[Timestamp](_, t => Date.valueOf(timestampToDateString(t)))
    +    // TimestampWritable.decimalToDate
    +    case _ =>
    +      _ => null
    +  }
    +
    +  // Date cannot be cast to long, according to hive
    +  private[this] def dateToLong(d: Date) = null
    --- End diff --
    
    Hive returns null if `cast(cast('2014-09-18' as date) as int)`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58735806
  
    retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL][wip]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55216369
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20136/consoleFull) for   PR 2344 at commit [`7269bba`](https://github.com/apache/spark/commit/7269bba52e12e646e440b89fbbff82f12995c6d5).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL][WIP]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55349736
  
    The last thing here is to enable `date_udf.q`. Currently there are two issues:
    1. PR #2355 `org.apache.hadoop.hive.serde2.io.DateWritable` got two constructors take one parameter, one is `int` and the other `java.sql.Date`, would both be int `primitiveTypes` and cause random failure when call this in a function.
    2. Have to figure out how to enable `datediff(TimestampType, ...)`, currently this would lead to a `scala.Matcherror` because it seems try to cast `TimestampType` to `TimestampType` in `ConstantFolding`, still looking why this is not solved in `SimplifyCasts`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58619034
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21577/Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r18696379
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -220,20 +220,44 @@ trait HiveTypeCoercion {
           case a: BinaryArithmetic if a.right.dataType == StringType =>
             a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
     
    +      // we should cast all timestamp/date/string compare into string compare
    +      case p: BinaryPredicate if p.left.dataType == StringType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == StringType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), p.right))
           case p: BinaryPredicate if p.left.dataType == StringType
             && p.right.dataType == TimestampType =>
    -        p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
    +        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
           case p: BinaryPredicate if p.left.dataType == TimestampType
             && p.right.dataType == StringType =>
    -        p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
    +        p.makeCopy(Array(Cast(p.left, StringType), p.right))
    +      case p: BinaryPredicate if p.left.dataType == TimestampType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == TimestampType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
     
           case p: BinaryPredicate if p.left.dataType == StringType && p.right.dataType != StringType =>
             p.makeCopy(Array(Cast(p.left, DoubleType), p.right))
           case p: BinaryPredicate if p.left.dataType != StringType && p.right.dataType == StringType =>
             p.makeCopy(Array(p.left, Cast(p.right, DoubleType)))
     
    -      case i @ In(a,b) if a.dataType == TimestampType && b.forall(_.dataType == StringType) =>
    -        i.makeCopy(Array(a,b.map(Cast(_,TimestampType))))
    +      case i @ In(a, b) if a.dataType == DateType && b.forall(_.dataType == StringType) =>
    +        i.makeCopy(Array(Cast(a, StringType), b))
    +      case i @ In(a, b) if a.dataType == TimestampType && b.forall(_.dataType == StringType) =>
    +        i.makeCopy(Array(Cast(a, StringType), b))
    +      case i @ In(a, b) if a.dataType == DateType && b.forall(_.dataType == TimestampType) =>
    +        i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType))))
    +      case i @ In(a, b) if a.dataType == TimestampType && b.forall(_.dataType == DateType) =>
    +        i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType))))
    +      case i @ In(a, b) if a.dataType == DateType && b.forall(_.dataType == DateType) =>
    +        i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType))))
    +      case i @ In(a, b) if a.dataType == TimestampType && b.forall(_.dataType == TimestampType) =>
    +        i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType))))
    --- End diff --
    
    The 4 case clauses above can also use long instead of string (and they probably should be moved to some rule other than `PromoteStrings`).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-57057116
  
    @marmbrus , just in case it will be out of date soon...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58658721
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21593/Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r18686823
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala ---
    @@ -181,6 +181,18 @@ case object TimestampType extends NativeType {
       def simpleString: String = "timestamp"
     }
     
    +case object DateType extends NativeType {
    +  private[sql] type JvmType = Date
    +
    +  @transient private[sql] lazy val tag = ScalaReflectionLock.synchronized { typeTag[JvmType] }
    +
    +  private[sql] val ordering = new Ordering[JvmType] {
    +    def compare(x: Date, y: Date) = x.toString.compareTo(y.toString)
    +  }
    +
    +  def simpleString: String = "date"
    --- End diff --
    
    I'll rebase my code then.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-57555702
  
    I'm fine with finishing this in another PR, but can you remove the failing test from the whitelist?  Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56022043
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20537/consoleFull) for   PR 2344 at commit [`458c113`](https://github.com/apache/spark/commit/458c11394c2e8788cee7664ed8c6f094a3df9f97).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class JavaSparkContext(val sc: SparkContext)`
      * `  class ArrayConstructor extends net.razorvine.pickle.objects.ArrayConstructor `
      * `      throw new IllegalStateException("The main method in the given main class must be static")`
      * `class TaskCompletionListenerException(errorMessages: Seq[String]) extends Exception `
      * `class NonASCIICharacterChecker extends ScalariformChecker `
      * `        class Dummy(object):`
      * `class RatingDeserializer(FramedSerializer):`
      * `class SCCallSiteSync(object):`
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `case class CreateTableAsSelect(`
      * `public class DateType extends DataType `
      * `  class Encoder[T <: NativeType](columnType: NativeColumnType[T]) extends compression.Encoder[T] `
      * `  class Encoder[T <: NativeType](columnType: NativeColumnType[T]) extends compression.Encoder[T] `
      * `  class Encoder[T <: NativeType](columnType: NativeColumnType[T]) extends compression.Encoder[T] `
      * `  class Encoder extends compression.Encoder[IntegerType.type] `
      * `  class Decoder(buffer: ByteBuffer, columnType: NativeColumnType[IntegerType.type])`
      * `  class Encoder extends compression.Encoder[LongType.type] `
      * `  class Decoder(buffer: ByteBuffer, columnType: NativeColumnType[LongType.type])`
      * `case class CreateTableAsSelect(`
      * `class JavaStreamingContext(val ssc: StreamingContext) extends Closeable `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-57062345
  
    Sorry for the delay, this week has been very busy!  I'd like to merge this soon, only one small question.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17703854
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -95,6 +100,8 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression {
           buildCast[Short](_, s => new Timestamp(s * 1000))
         case ByteType =>
           buildCast[Byte](_, b => new Timestamp(b * 1000))
    +    case DateType =>
    +      buildCast[Date](_, d => Timestamp.valueOf(dateToString(d) + " 00:00:00"))
    --- End diff --
    
    Will that be more efficient `new Timestamp(d.getTime())`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56018966
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20537/consoleFull) for   PR 2344 at commit [`458c113`](https://github.com/apache/spark/commit/458c11394c2e8788cee7664ed8c6f094a3df9f97).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56327272
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20642/consoleFull) for   PR 2344 at commit [`92b31b4`](https://github.com/apache/spark/commit/92b31b4ef9f86ff05c0d0429eb382e27da1fe18e).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17832080
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala ---
    @@ -170,6 +170,18 @@ case object TimestampType extends NativeType {
       def simpleString: String = "timestamp"
     }
     
    +case object DateType extends NativeType {
    +  private[sql] type JvmType = Date
    +
    +  @transient private[sql] lazy val tag = ScalaReflectionLock.synchronized { typeTag[JvmType] }
    +
    +  private[sql] val ordering = new Ordering[JvmType] {
    +    def compare(x: Date, y: Date) = x.compareTo(y)
    --- End diff --
    
    This function is only used for ordering, not for data comparison.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58654642
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21593/consoleFull) for   PR 2344 at commit [`f15074a`](https://github.com/apache/spark/commit/f15074a614281d3fe4de4f0529ddc53994b4c0d9).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r18698063
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnStats.scala ---
    @@ -190,6 +190,24 @@ private[sql] class StringColumnStats extends ColumnStats {
       def collectedStatistics = Row(lower, upper, nullCount)
     }
     
    +private[sql] class DateColumnStats extends ColumnStats {
    +  var upper: Date = null
    +  var lower: Date = null
    +  var nullCount = 0
    +
    +  override def gatherStats(row: Row, ordinal: Int) {
    +    if (!row.isNullAt(ordinal)) {
    +      val value = row(ordinal).asInstanceOf[Date]
    +      if (upper == null || value.toString.compareTo(upper.toString) > 0) upper = value
    +      if (lower == null || value.toString.compareTo(lower.toString) < 0) lower = value
    --- End diff --
    
    Using `.getTime` to do the comparison would be much more efficient. When caching a `DateType` column, this function is really a critical path.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56336063
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20647/consoleFull) for   PR 2344 at commit [`8e91db0`](https://github.com/apache/spark/commit/8e91db07e5df95fa7962be892185e5430d17b793).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56342088
  
    @adrian-wang clarified some of my concerns offline, I think generally it's ready to be merged. @marmbrus , @liancheng , any more comments on this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r18125250
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -220,20 +220,52 @@ trait HiveTypeCoercion {
           case a: BinaryArithmetic if a.right.dataType == StringType =>
             a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
     
    +      // we should cast all timestamp/date/string compare into string compare,
    +      // even if both sides are of same type, as Hive use xxxwritable to compare.
    --- End diff --
    
    Can you explain this more?  It seems more expensive to convert to strings and then compare strings instead of just comparing the underlying types.  What does writables have to do with this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-57081730
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20933/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r18126862
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -220,20 +220,52 @@ trait HiveTypeCoercion {
           case a: BinaryArithmetic if a.right.dataType == StringType =>
             a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
     
    +      // we should cast all timestamp/date/string compare into string compare,
    +      // even if both sides are of same type, as Hive use xxxwritable to compare.
    --- End diff --
    
    The native `compareTo` of both `java.sql.Date` point to `java.util.Date`, which compares the millis since epoch, but `DateWritable` compares days since epoch, so here is the gap.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58657404
  
    Seems we are getting something wrong with orc in jenkins... My local test is fine.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-57079131
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20933/consoleFull) for   PR 2344 at commit [`f4058ab`](https://github.com/apache/spark/commit/f4058ab1a185b3dc3fb5fe1522ef5b481601d873).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17703765
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -221,19 +221,35 @@ trait HiveTypeCoercion {
             a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
     
           case p: BinaryPredicate if p.left.dataType == StringType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(Cast(p.left, DateType), p.right))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == StringType =>
    +        p.makeCopy(Array(p.left, Cast(p.right, DateType)))
    +      case p: BinaryPredicate if p.left.dataType == StringType
             && p.right.dataType == TimestampType =>
             p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
           case p: BinaryPredicate if p.left.dataType == TimestampType
             && p.right.dataType == StringType =>
             p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
    +      case p: BinaryPredicate if p.left.dataType == TimestampType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(Cast(p.left, DateType), p.right))
    --- End diff --
    
    I didn't check this in Hive, but probably it's better to convert the `Date` => `Timestamp`, since `Timestamp` is more precise. The same for the rule `In` .


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58616217
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21577/consoleFull) for   PR 2344 at commit [`0df6ea1`](https://github.com/apache/spark/commit/0df6ea1964710e6376daac282686da7fb35c29b2).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56327106
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20640/consoleFull) for   PR 2344 at commit [`e3b12af`](https://github.com/apache/spark/commit/e3b12afca1a06ccf37d032187f7ff60f19793fef).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58639081
  
    Left some minor comments, otherwise LGTM.  Also, do we want to add Python API in this PR or a separate one?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r18739656
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -220,20 +220,44 @@ trait HiveTypeCoercion {
           case a: BinaryArithmetic if a.right.dataType == StringType =>
             a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
     
    +      // we should cast all timestamp/date/string compare into string compare
    +      case p: BinaryPredicate if p.left.dataType == StringType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == StringType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), p.right))
           case p: BinaryPredicate if p.left.dataType == StringType
             && p.right.dataType == TimestampType =>
    -        p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
    +        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
           case p: BinaryPredicate if p.left.dataType == TimestampType
             && p.right.dataType == StringType =>
    -        p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
    +        p.makeCopy(Array(Cast(p.left, StringType), p.right))
    +      case p: BinaryPredicate if p.left.dataType == TimestampType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == TimestampType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
    --- End diff --
    
    OK... verified this behavior with Hive, I've no idea about this :(


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55983850
  
    A few comments on returns null V.S. raise exception in `Cast`, and which is the more wider type for `Date` and `Timestamp`, the other LGTM.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58658714
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21593/consoleFull) for   PR 2344 at commit [`f15074a`](https://github.com/apache/spark/commit/f15074a614281d3fe4de4f0529ddc53994b4c0d9).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58665073
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21594/consoleFull) for   PR 2344 at commit [`f15074a`](https://github.com/apache/spark/commit/f15074a614281d3fe4de4f0529ddc53994b4c0d9).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r18685528
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala ---
    @@ -181,6 +181,18 @@ case object TimestampType extends NativeType {
       def simpleString: String = "timestamp"
     }
     
    +case object DateType extends NativeType {
    +  private[sql] type JvmType = Date
    +
    +  @transient private[sql] lazy val tag = ScalaReflectionLock.synchronized { typeTag[JvmType] }
    +
    +  private[sql] val ordering = new Ordering[JvmType] {
    +    def compare(x: Date, y: Date) = x.toString.compareTo(y.toString)
    +  }
    +
    +  def simpleString: String = "date"
    --- End diff --
    
    `simpleString` can be removed now. #2563 renamed this method to `typeName` and a default implementation is provided in `DataType`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58657652
  
    I guess this one is related: https://github.com/apache/spark/commit/411cf29fff011561f0093bb6101af87842828369


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58655469
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21591/Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56332589
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20647/consoleFull) for   PR 2344 at commit [`8e91db0`](https://github.com/apache/spark/commit/8e91db07e5df95fa7962be892185e5430d17b793).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-57550528
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/238/consoleFull) for   PR 2344 at commit [`f4058ab`](https://github.com/apache/spark/commit/f4058ab1a185b3dc3fb5fe1522ef5b481601d873).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56018809
  
    Though Hive would throw exception when cast int to date, I think it doesn't matter if not obey, because query exit with exception is considered a failure in `runHive`. Let's keep cast int to date as null, exactly as what hive does with cast string to int.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56327262
  
    have rebased and addressed comments from @chenghao-intel .


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55967788
  
    test this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17765726
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala ---
    @@ -170,6 +170,18 @@ case object TimestampType extends NativeType {
       def simpleString: String = "timestamp"
     }
     
    +case object DateType extends NativeType {
    +  private[sql] type JvmType = Date
    +
    +  @transient private[sql] lazy val tag = ScalaReflectionLock.synchronized { typeTag[JvmType] }
    +
    +  private[sql] val ordering = new Ordering[JvmType] {
    +    def compare(x: Date, y: Date) = x.compareTo(y)
    --- End diff --
    
    I've checked the logic of `java.sql.Date`.`compareTo`, and it is not the same as `DateWritable`.`compareTo`, which is the internal representation in Hive. The former will compare its milliseconds, but the later only compare the `days since Epoch`, probably we need to follow the same semantic with Hive here.
    BTW: If we change the logic here, does that also mean we needn't cast the `date` to `string` for `BinaryPredicate` in `HiveTypeCoercion`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r18705174
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -220,20 +220,44 @@ trait HiveTypeCoercion {
           case a: BinaryArithmetic if a.right.dataType == StringType =>
             a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
     
    +      // we should cast all timestamp/date/string compare into string compare
    +      case p: BinaryPredicate if p.left.dataType == StringType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == StringType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), p.right))
           case p: BinaryPredicate if p.left.dataType == StringType
             && p.right.dataType == TimestampType =>
    -        p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
    +        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
           case p: BinaryPredicate if p.left.dataType == TimestampType
             && p.right.dataType == StringType =>
    -        p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
    +        p.makeCopy(Array(Cast(p.left, StringType), p.right))
    +      case p: BinaryPredicate if p.left.dataType == TimestampType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == TimestampType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
    --- End diff --
    
    It seems cast('1970-01-01' as date) < cast('1970-01-01 00:00:00' as timestamp)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55969213
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20495/consoleFull) for   PR 2344 at commit [`413f946`](https://github.com/apache/spark/commit/413f946ea2d9488b08186307bed550fa298a8a23).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17766373
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala ---
    @@ -170,6 +170,18 @@ case object TimestampType extends NativeType {
       def simpleString: String = "timestamp"
     }
     
    +case object DateType extends NativeType {
    +  private[sql] type JvmType = Date
    +
    +  @transient private[sql] lazy val tag = ScalaReflectionLock.synchronized { typeTag[JvmType] }
    +
    +  private[sql] val ordering = new Ordering[JvmType] {
    +    def compare(x: Date, y: Date) = x.compareTo(y)
    --- End diff --
    
    Do we also need to modify `compareTo` from `TimestampType`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56575927
  
    Yeah, I think this is probably ready to merge.  Mind updating / resolving conflicts?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55222163
  
    Ha, first pass in recent days!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17704236
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -177,6 +223,8 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression {
           })
         case BooleanType =>
           buildCast[Boolean](_, b => if (b) 1.toShort else 0.toShort)
    +    case DateType =>
    --- End diff --
    
    If `Datetype` can not be casted into `IntegerType` / `ShortType` / `LongType`, let's remove this, raise exception in `compile` time probably better than in `runtime`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58299209
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21442/consoleFull) for   PR 2344 at commit [`0c339b0`](https://github.com/apache/spark/commit/0c339b0b4b5572327ff1d61d9c441dabde6d1b45).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17704012
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -140,6 +147,39 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression {
         }
       }
     
    +  // Converts Timestamp to string according to Hive TimestampWritable convention
    +  private[this] def timestampToDateString(ts: Timestamp): String = {
    +    Cast.threadLocalDateFormat.get.format(ts)
    +  }
    +
    +  // DateConverter
    +  private[this] def castToDate: Any => Any = child.dataType match {
    --- End diff --
    
    We probably need to compare the date conversion with Hive, https://github.com/apache/hive/blob/branch-0.12/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java#L1032
    
    Particularly for `null` returns or throws exception.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56617051
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20737/consoleFull) for   PR 2344 at commit [`be6a958`](https://github.com/apache/spark/commit/be6a958284932b0685656869b6586f1159cdb599).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58303279
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21442/Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58619031
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21577/consoleFull) for   PR 2344 at commit [`0df6ea1`](https://github.com/apache/spark/commit/0df6ea1964710e6376daac282686da7fb35c29b2).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56028120
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20539/consoleFull) for   PR 2344 at commit [`252e84a`](https://github.com/apache/spark/commit/252e84a185191be5bb07681561e4f16c80fc8bf9).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17704067
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -140,6 +147,39 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression {
         }
       }
     
    +  // Converts Timestamp to string according to Hive TimestampWritable convention
    +  private[this] def timestampToDateString(ts: Timestamp): String = {
    +    Cast.threadLocalDateFormat.get.format(ts)
    +  }
    +
    +  // DateConverter
    +  private[this] def castToDate: Any => Any = child.dataType match {
    +    case StringType =>
    +      buildCast[String](_, s => if (s.contains(" ")) {
    +        try castToDate(castToTimestamp(s))
    +        catch { case _: java.lang.IllegalArgumentException => null }
    +      } else {
    +        try Date.valueOf(s) catch { case _: java.lang.IllegalArgumentException => null }
    +      })
    +    case TimestampType =>
    +      buildCast[Timestamp](_, t => Date.valueOf(timestampToDateString(t)))
    +    // TimestampWritable.decimalToDate
    +    case _ =>
    +      _ => null
    +  }
    +
    +  // Date cannot be cast to long, according to hive
    +  private[this] def dateToLong(d: Date) = null
    --- End diff --
    
    Throw exception instead? Seems Hive does that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58736896
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21615/consoleFull) for   PR 2344 at commit [`f15074a`](https://github.com/apache/spark/commit/f15074a614281d3fe4de4f0529ddc53994b4c0d9).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56328916
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20640/consoleFull) for   PR 2344 at commit [`e3b12af`](https://github.com/apache/spark/commit/e3b12afca1a06ccf37d032187f7ff60f19793fef).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17765755
  
    --- Diff: sql/core/src/main/java/org/apache/spark/sql/api/java/DateType.java ---
    @@ -0,0 +1,27 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.api.java;
    +
    +/**
    + * The data type representing java.sql.Timestamp values.
    + *
    + * {@code TimestampType} is represented by the singleton object {@link DataType#TimestampType}.
    --- End diff --
    
    Update the comment, please. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL][WIP]Add Date type support

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55337715
  
    This looks pretty awesome!  I'd like to do a quick pass and make sure there aren't any missing places, but overall this looks complete to me.  Is there a reason the PR title still says WIP?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17718951
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -177,6 +223,8 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression {
           })
         case BooleanType =>
           buildCast[Boolean](_, b => if (b) 1.toShort else 0.toShort)
    +    case DateType =>
    --- End diff --
    
    Hive returns NULL when cast from DATE to INT., etc


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58660519
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21594/consoleFull) for   PR 2344 at commit [`f15074a`](https://github.com/apache/spark/commit/f15074a614281d3fe4de4f0529ddc53994b4c0d9).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56032559
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20539/consoleFull) for   PR 2344 at commit [`252e84a`](https://github.com/apache/spark/commit/252e84a185191be5bb07681561e4f16c80fc8bf9).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL][wip]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55221787
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20136/consoleFull) for   PR 2344 at commit [`7269bba`](https://github.com/apache/spark/commit/7269bba52e12e646e440b89fbbff82f12995c6d5).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58665080
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21594/Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r18704345
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -220,20 +220,44 @@ trait HiveTypeCoercion {
           case a: BinaryArithmetic if a.right.dataType == StringType =>
             a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
     
    +      // we should cast all timestamp/date/string compare into string compare
    +      case p: BinaryPredicate if p.left.dataType == StringType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == StringType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), p.right))
           case p: BinaryPredicate if p.left.dataType == StringType
             && p.right.dataType == TimestampType =>
    -        p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
    +        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
           case p: BinaryPredicate if p.left.dataType == TimestampType
             && p.right.dataType == StringType =>
    -        p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
    +        p.makeCopy(Array(Cast(p.left, StringType), p.right))
    +      case p: BinaryPredicate if p.left.dataType == TimestampType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == TimestampType =>
    +        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
     
           case p: BinaryPredicate if p.left.dataType == StringType && p.right.dataType != StringType =>
             p.makeCopy(Array(Cast(p.left, DoubleType), p.right))
           case p: BinaryPredicate if p.left.dataType != StringType && p.right.dataType == StringType =>
             p.makeCopy(Array(p.left, Cast(p.right, DoubleType)))
     
    -      case i @ In(a,b) if a.dataType == TimestampType && b.forall(_.dataType == StringType) =>
    -        i.makeCopy(Array(a,b.map(Cast(_,TimestampType))))
    +      case i @ In(a, b) if a.dataType == DateType && b.forall(_.dataType == StringType) =>
    +        i.makeCopy(Array(Cast(a, StringType), b))
    +      case i @ In(a, b) if a.dataType == TimestampType && b.forall(_.dataType == StringType) =>
    +        i.makeCopy(Array(Cast(a, StringType), b))
    +      case i @ In(a, b) if a.dataType == DateType && b.forall(_.dataType == TimestampType) =>
    +        i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType))))
    +      case i @ In(a, b) if a.dataType == TimestampType && b.forall(_.dataType == DateType) =>
    +        i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType))))
    +      case i @ In(a, b) if a.dataType == DateType && b.forall(_.dataType == DateType) =>
    +        i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType))))
    +      case i @ In(a, b) if a.dataType == TimestampType && b.forall(_.dataType == TimestampType) =>
    +        i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType))))
    --- End diff --
    
    Using long value of `java.sql.Date` is way too far from Hive. I have discussed with @chenghao-intel and We think we can leave it as it is and fix the ordering and comparing in a separate PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55877253
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20463/consoleFull) for   PR 2344 at commit [`413f946`](https://github.com/apache/spark/commit/413f946ea2d9488b08186307bed550fa298a8a23).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL][WIP]Add Date type support

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17453165
  
    --- Diff: .gitignore ---
    @@ -19,6 +19,7 @@ conf/spark-env.sh
     conf/streaming-env.sh
     conf/log4j.properties
     conf/spark-defaults.conf
    +conf/hive-log4j.properties
    --- End diff --
    
    We are considering removing the need for this special config file in #2263 so maybe best to leave this out.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58303274
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21442/consoleFull) for   PR 2344 at commit [`0c339b0`](https://github.com/apache/spark/commit/0c339b0b4b5572327ff1d61d9c441dabde6d1b45).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r18698218
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
    @@ -376,8 +395,8 @@ private[sql] sealed abstract class ByteArrayColumnType[T <: DataType](
       }
     }
     
    -private[sql] object BINARY extends ByteArrayColumnType[BinaryType.type](9, 16) {
    -  override def setField(row: MutableRow, ordinal: Int, value: Array[Byte]): Unit = {
    +private[sql] object BINARY extends ByteArrayColumnType[BinaryType.type](10, 16) {
    +  override def setField(row: MutableRow, ordinal: Int, value: Array[Byte]) {
    --- End diff --
    
    Nit: Hmm, actually `: Unit =` is more recommended :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58600181
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/329/consoleFull) for   PR 2344 at commit [`0c339b0`](https://github.com/apache/spark/commit/0c339b0b4b5572327ff1d61d9c441dabde6d1b45).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58656118
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21592/consoleFull) for   PR 2344 at commit [`2038085`](https://github.com/apache/spark/commit/203808565ccb6f1e503b4feb2a8eac4fe9b59408).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging `
      * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r18126726
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -220,20 +220,52 @@ trait HiveTypeCoercion {
           case a: BinaryArithmetic if a.right.dataType == StringType =>
             a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
     
    +      // we should cast all timestamp/date/string compare into string compare,
    +      // even if both sides are of same type, as Hive use xxxwritable to compare.
    --- End diff --
    
    I considered this question again and now think when comparing same types, it is better not to convert to string but write `compareTo` methods for them, since the native `compareTo` method of `java.sql.date` seems not consistent with `DateWritable`. I'll do a quick follow up.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17703950
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -140,6 +147,39 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression {
         }
       }
     
    +  // Converts Timestamp to string according to Hive TimestampWritable convention
    +  private[this] def timestampToDateString(ts: Timestamp): String = {
    +    Cast.threadLocalDateFormat.get.format(ts)
    +  }
    +
    +  // DateConverter
    +  private[this] def castToDate: Any => Any = child.dataType match {
    +    case StringType =>
    +      buildCast[String](_, s => if (s.contains(" ")) {
    +        try castToDate(castToTimestamp(s))
    +        catch { case _: java.lang.IllegalArgumentException => null }
    +      } else {
    +        try Date.valueOf(s) catch { case _: java.lang.IllegalArgumentException => null }
    +      })
    +    case TimestampType =>
    +      buildCast[Timestamp](_, t => Date.valueOf(timestampToDateString(t)))
    +    // TimestampWritable.decimalToDate
    +    case _ =>
    --- End diff --
    
    Nit: Keep in the same line.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-55870259
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20463/consoleFull) for   PR 2344 at commit [`413f946`](https://github.com/apache/spark/commit/413f946ea2d9488b08186307bed550fa298a8a23).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-58652251
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21592/consoleFull) for   PR 2344 at commit [`2038085`](https://github.com/apache/spark/commit/203808565ccb6f1e503b4feb2a8eac4fe9b59408).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2344#discussion_r17703775
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -221,19 +221,35 @@ trait HiveTypeCoercion {
             a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
     
           case p: BinaryPredicate if p.left.dataType == StringType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(Cast(p.left, DateType), p.right))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == StringType =>
    +        p.makeCopy(Array(p.left, Cast(p.right, DateType)))
    +      case p: BinaryPredicate if p.left.dataType == StringType
             && p.right.dataType == TimestampType =>
             p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
           case p: BinaryPredicate if p.left.dataType == TimestampType
             && p.right.dataType == StringType =>
             p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
    +      case p: BinaryPredicate if p.left.dataType == TimestampType
    +        && p.right.dataType == DateType =>
    +        p.makeCopy(Array(Cast(p.left, DateType), p.right))
    +      case p: BinaryPredicate if p.left.dataType == DateType
    +        && p.right.dataType == TimestampType =>
    +        p.makeCopy(Array(p.left, Cast(p.right, DateType)))
     
           case p: BinaryPredicate if p.left.dataType == StringType && p.right.dataType != StringType =>
             p.makeCopy(Array(Cast(p.left, DoubleType), p.right))
           case p: BinaryPredicate if p.left.dataType != StringType && p.right.dataType == StringType =>
             p.makeCopy(Array(p.left, Cast(p.right, DoubleType)))
     
    +      case i @ In(a,b) if a.dataType == DateType && b.forall(_.dataType == StringType) =>
    +        i.makeCopy(Array(a,b.map(Cast(_,DateType))))
    --- End diff --
    
    Nit: space after `,`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-57554545
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/238/consoleFull) for   PR 2344 at commit [`f4058ab`](https://github.com/apache/spark/commit/f4058ab1a185b3dc3fb5fe1522ef5b481601d873).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

Posted by adrian-wang <gi...@git.apache.org>.
Github user adrian-wang commented on the pull request:

    https://github.com/apache/spark/pull/2344#issuecomment-56616986
  
    @marmbrus , I have updated on current master and run sbt tests locally.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org