Posted to reviews@spark.apache.org by jamesthomp <gi...@git.apache.org> on 2018/05/02 16:08:48 UTC

[GitHub] spark pull request #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIME...

GitHub user jamesthomp opened a pull request:

    https://github.com/apache/spark/pull/21217

    [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP to be case insensitive

    ## What changes were proposed in this pull request?
    
    This PR makes CURRENT_DATE and CURRENT_TIMESTAMP case insensitive again. This was the behavior prior to the merge of SPARK-22333.
    
    ## How was this patch tested?
    
    Existing tests + added a new test to specifically catch this case.
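
To make the regression concrete, here is a minimal toy model in Python. It is not Spark's Analyzer code; `LITERAL_FUNCTIONS`, `names_match`, `resolve` and the `fixed` flag are hypothetical names invented for illustration. It assumes that columns are matched using the session's case-sensitivity setting, and that the two literal functions are a fallback which should always be matched case-insensitively.

```python
# Toy model only: NOT Spark's Analyzer. All names below are hypothetical.
LITERAL_FUNCTIONS = ("current_date", "current_timestamp")


def names_match(a, b, case_sensitive):
    """Compare two identifiers, honoring the given case-sensitivity flag."""
    return a == b if case_sensitive else a.lower() == b.lower()


def resolve(identifier, columns, case_sensitive, fixed=True):
    """Resolve a bare identifier to a column, a literal function, or nothing.

    case_sensitive models spark.sql.caseSensitive for column lookup.
    fixed=False models the regression: function names inherit that setting.
    fixed=True models the intended behavior: function names ignore case.
    """
    # Columns take precedence and follow the session setting.
    for col in columns:
        if names_match(col, identifier, case_sensitive):
            return ("column", col)

    # Fall back to the literal functions.
    fn_case_sensitive = False if fixed else case_sensitive
    for fn in LITERAL_FUNCTIONS:
        if names_match(fn, identifier, fn_case_sensitive):
            return ("function", fn)

    # Nothing matched: the identifier stays an unresolved column reference.
    return ("unresolved", identifier)
```

In this model, with `case_sensitive=True` and `fixed=False`, `resolve("CURRENT_DATE", ["id"], True, fixed=False)` is left unresolved while the lower-case spelling still resolves to the function. With `fixed=True`, every spelling resolves to the function unless a column of that name exists.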

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/palantir/spark jt/case-sensitive-literals

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21217.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21217
    
----
commit 662dd2e1dcd9de2ada87811e116b420f138cfa13
Author: James Thompson <ja...@...>
Date:   2018-04-30T20:28:37Z

    fix case sensitive literals

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    ok to test


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    retest this please


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93059/
    Test FAILed.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Can one of the admins verify this patch?


---



[GitHub] spark pull request #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIME...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21217


---



[GitHub] spark pull request #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIME...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21217#discussion_r186280533
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1812,6 +1812,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
       - Since Spark 2.4, creating a managed table with nonempty location is not allowed. An exception is thrown when attempting to create a managed table with nonempty location. To set `true` to `spark.sql.allowCreatingManagedTableUsingNonemptyLocation` restores the previous behavior. This option will be removed in Spark 3.0.
       - Since Spark 2.4, the type coercion rules can automatically promote the argument types of the variadic SQL functions (e.g., IN/COALESCE) to the widest common type, no matter how the input arguments order. In prior Spark versions, the promotion could fail in some specific orders (e.g., TimestampType, IntegerType and StringType) and throw an exception.
       - In version 2.3 and earlier, `to_utc_timestamp` and `from_utc_timestamp` respect the timezone in the input timestamp string, which breaks the assumption that the input timestamp is in a specific timezone. Therefore, these 2 functions can return unexpected results. In version 2.4 and later, this problem has been fixed. `to_utc_timestamp` and `from_utc_timestamp` will return null if the input timestamp string contains timezone. As an example, `from_utc_timestamp('2000-10-10 00:00:00', 'GMT+1')` will return `2000-10-10 01:00:00` in both Spark 2.3 and 2.4. However, `from_utc_timestamp('2000-10-10 00:00:00+00:00', 'GMT+1')`, assuming a local timezone of GMT+8, will return `2000-10-10 09:00:00` in Spark 2.3 but `null` in 2.4. For people who don't care about this problem and want to retain the previous behaivor to keep their query unchanged, you can set `spark.sql.function.rejectTimezoneInString` to false. This option will be removed in Spark 3.0 and should only be used as a temporary workaround.
    +  - In version 2.3, if `spark.sql.caseSensitive` is set to true, then the `CURRENT_DATE` and `CURRENT_TIMESTAMP` functions incorrectly became case-sensitive and would resolve to columns (unless typed in lower case). In Spark 2.4 this has been fixed and the functions are no longer case-sensitive.
    --- End diff --
    
    +1


---



[GitHub] spark pull request #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIME...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21217#discussion_r186276456
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1812,6 +1812,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
       - Since Spark 2.4, creating a managed table with nonempty location is not allowed. An exception is thrown when attempting to create a managed table with nonempty location. To set `true` to `spark.sql.allowCreatingManagedTableUsingNonemptyLocation` restores the previous behavior. This option will be removed in Spark 3.0.
       - Since Spark 2.4, the type coercion rules can automatically promote the argument types of the variadic SQL functions (e.g., IN/COALESCE) to the widest common type, no matter how the input arguments order. In prior Spark versions, the promotion could fail in some specific orders (e.g., TimestampType, IntegerType and StringType) and throw an exception.
       - In version 2.3 and earlier, `to_utc_timestamp` and `from_utc_timestamp` respect the timezone in the input timestamp string, which breaks the assumption that the input timestamp is in a specific timezone. Therefore, these 2 functions can return unexpected results. In version 2.4 and later, this problem has been fixed. `to_utc_timestamp` and `from_utc_timestamp` will return null if the input timestamp string contains timezone. As an example, `from_utc_timestamp('2000-10-10 00:00:00', 'GMT+1')` will return `2000-10-10 01:00:00` in both Spark 2.3 and 2.4. However, `from_utc_timestamp('2000-10-10 00:00:00+00:00', 'GMT+1')`, assuming a local timezone of GMT+8, will return `2000-10-10 09:00:00` in Spark 2.3 but `null` in 2.4. For people who don't care about this problem and want to retain the previous behaivor to keep their query unchanged, you can set `spark.sql.function.rejectTimezoneInString` to false. This option will be removed in Spark 3.0 and should only be used as a temporary workaround.
    +  - In version 2.3, if `spark.sql.caseSensitive` is set to true, then the `CURRENT_DATE` and `CURRENT_TIMESTAMP` functions incorrectly became case-sensitive and would resolve to columns (unless typed in lower case). In Spark 2.4 this has been fixed and the functions are no longer case-sensitive.
    --- End diff --
    
    +1 for @felixcheung 's suggestion.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    **[Test build #90081 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90081/testReport)** for PR 21217 at commit [`662dd2e`](https://github.com/apache/spark/commit/662dd2e1dcd9de2ada87811e116b420f138cfa13).


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Thanks @jamesthomp for your work. When we pick this up, I think we can still give the credit for the work to you.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    **[Test build #93181 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93181/testReport)** for PR 21217 at commit [`f31cd56`](https://github.com/apache/spark/commit/f31cd56e754cc88d597ccbd9086317985c890e12).


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Build finished. Test FAILed.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    **[Test build #93059 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93059/testReport)** for PR 21217 at commit [`6db67e4`](https://github.com/apache/spark/commit/6db67e4619a05357bfe693e315b732f6bca6e5ea).
     * This patch **fails due to an unknown error code, -9**.
     * This patch **does not merge cleanly**.
     * This patch adds the following public classes _(experimental)_:
      * `  class ImplicitTypeCasts(conf: SQLConf) extends TypeCoercionRule `
      * `case class StringToTimestampWithoutTimezone(child: Expression, timeZoneId: Option[String] = None)`


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    @jamesthomp Could you document the behavior change in the migration guide? https://github.com/apache/spark/blame/master/docs/sql-programming-guide.md#L1802


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    @jamesthomp, mind updating this please?


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90194/
    Test PASSed.


---



[GitHub] spark pull request #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIME...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21217#discussion_r186252704
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1812,6 +1812,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
       - Since Spark 2.4, creating a managed table with nonempty location is not allowed. An exception is thrown when attempting to create a managed table with nonempty location. To set `true` to `spark.sql.allowCreatingManagedTableUsingNonemptyLocation` restores the previous behavior. This option will be removed in Spark 3.0.
       - Since Spark 2.4, the type coercion rules can automatically promote the argument types of the variadic SQL functions (e.g., IN/COALESCE) to the widest common type, no matter how the input arguments order. In prior Spark versions, the promotion could fail in some specific orders (e.g., TimestampType, IntegerType and StringType) and throw an exception.
       - In version 2.3 and earlier, `to_utc_timestamp` and `from_utc_timestamp` respect the timezone in the input timestamp string, which breaks the assumption that the input timestamp is in a specific timezone. Therefore, these 2 functions can return unexpected results. In version 2.4 and later, this problem has been fixed. `to_utc_timestamp` and `from_utc_timestamp` will return null if the input timestamp string contains timezone. As an example, `from_utc_timestamp('2000-10-10 00:00:00', 'GMT+1')` will return `2000-10-10 01:00:00` in both Spark 2.3 and 2.4. However, `from_utc_timestamp('2000-10-10 00:00:00+00:00', 'GMT+1')`, assuming a local timezone of GMT+8, will return `2000-10-10 09:00:00` in Spark 2.3 but `null` in 2.4. For people who don't care about this problem and want to retain the previous behaivor to keep their query unchanged, you can set `spark.sql.function.rejectTimezoneInString` to false. This option will be removed in Spark 3.0 and should only be used as a temporary workaround.
    +  - In version 2.3, if `spark.sql.caseSensitive` is set to true, then the `CURRENT_DATE` and `CURRENT_TIMESTAMP` functions incorrectly became case-sensitive and would resolve to columns (unless typed in lower case). In Spark 2.4 this has been fixed and the functions are no longer case-sensitive.
    --- End diff --
    
    We should backport this correctness fix to branches 2.2 and 2.3.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    **[Test build #90194 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90194/testReport)** for PR 21217 at commit [`6db67e4`](https://github.com/apache/spark/commit/6db67e4619a05357bfe693e315b732f6bca6e5ea).


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    ok to test


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    @mgaido91 No problem. Please submit the PR.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    **[Test build #90138 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90138/testReport)** for PR 21217 at commit [`662dd2e`](https://github.com/apache/spark/commit/662dd2e1dcd9de2ada87811e116b420f138cfa13).


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    **[Test build #90081 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90081/testReport)** for PR 21217 at commit [`662dd2e`](https://github.com/apache/spark/commit/662dd2e1dcd9de2ada87811e116b420f138cfa13).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIME...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21217#discussion_r203205223
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1857,6 +1857,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
       - In version 2.3 and earlier, Spark converts Parquet Hive tables by default but ignores table properties like `TBLPROPERTIES (parquet.compression 'NONE')`. This happens for ORC Hive table properties like `TBLPROPERTIES (orc.compress 'NONE')` in case of `spark.sql.hive.convertMetastoreOrc=true`, too. Since Spark 2.4, Spark respects Parquet/ORC specific table properties while converting Parquet/ORC Hive tables. As an example, `CREATE TABLE t(id int) STORED AS PARQUET TBLPROPERTIES (parquet.compression 'NONE')` would generate Snappy parquet files during insertion in Spark 2.3, and in Spark 2.4, the result would be uncompressed parquet files.
       - Since Spark 2.0, Spark converts Parquet Hive tables by default for better performance. Since Spark 2.4, Spark converts ORC Hive tables by default, too. It means Spark uses its own ORC support by default instead of Hive SerDe. As an example, `CREATE TABLE t(id int) STORED AS ORC` would be handled with Hive SerDe in Spark 2.3, and in Spark 2.4, it would be converted into Spark's ORC data source table and ORC vectorization would be applied. To set `false` to `spark.sql.hive.convertMetastoreOrc` restores the previous behavior.
       - In version 2.3 and earlier, CSV rows are considered as malformed if at least one column value in the row is malformed. CSV parser dropped such rows in the DROPMALFORMED mode or outputs an error in the FAILFAST mode. Since Spark 2.4, CSV row is considered as malformed only when it contains malformed column values requested from CSV datasource, other values can be ignored. As an example, CSV file contains the "id,name" header and one row "1234". In Spark 2.4, selection of the id column consists of a row with one column value 1234 but in Spark 2.3 and earlier it is empty in the DROPMALFORMED mode. To restore the previous behavior, set `spark.sql.csv.parser.columnPruning.enabled` to `false`.
    +  - In versions 2.2.1 and 2.3.0, if `spark.sql.caseSensitive` is set to true, then the `CURRENT_DATE` and `CURRENT_TIMESTAMP` functions incorrectly became case-sensitive and would resolve to columns (unless typed in lower case). In later versions, this has been fixed and the functions are no longer case-sensitive.
    --- End diff --
    
    By now, 2.2.2 and 2.3.1 have been released, too. Also, voting on 2.3.2 has already started.
    So, the affected range seems to be `2.2.1 ~ 2.3.2`.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    **[Test build #93059 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93059/testReport)** for PR 21217 at commit [`6db67e4`](https://github.com/apache/spark/commit/6db67e4619a05357bfe693e315b732f6bca6e5ea).


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90138/
    Test PASSed.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    **[Test build #93181 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93181/testReport)** for PR 21217 at commit [`f31cd56`](https://github.com/apache/spark/commit/f31cd56e754cc88d597ccbd9086317985c890e12).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by jamesthomp <gi...@git.apache.org>.
Github user jamesthomp commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    @HyukjinKwon - I have resolved the conflict with the docs changes. Please let me know if any additional changes are required.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    If that's OK with you, @viirya, I'll submit a PR for this. I'll note in the description that the credit should be given to @jamesthomp, though I think the committer can also do that when eventually merging it. Thanks.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    **[Test build #90194 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90194/testReport)** for PR 21217 at commit [`6db67e4`](https://github.com/apache/spark/commit/6db67e4619a05357bfe693e315b732f6bca6e5ea).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `  class ImplicitTypeCasts(conf: SQLConf) extends TypeCoercionRule `
      * `case class StringToTimestampWithoutTimezone(child: Expression, timeZoneId: Option[String] = None)`


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by jamesthomp <gi...@git.apache.org>.
Github user jamesthomp commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    @gatorsmile - I have added a note to the migration guide.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Can anyone take over this then?
    
    cc @kiszk, @mgaido91 and @viirya as well FYI.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIME...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21217#discussion_r186150689
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1812,6 +1812,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
       - Since Spark 2.4, creating a managed table with nonempty location is not allowed. An exception is thrown when attempting to create a managed table with nonempty location. To set `true` to `spark.sql.allowCreatingManagedTableUsingNonemptyLocation` restores the previous behavior. This option will be removed in Spark 3.0.
       - Since Spark 2.4, the type coercion rules can automatically promote the argument types of the variadic SQL functions (e.g., IN/COALESCE) to the widest common type, no matter how the input arguments order. In prior Spark versions, the promotion could fail in some specific orders (e.g., TimestampType, IntegerType and StringType) and throw an exception.
       - In version 2.3 and earlier, `to_utc_timestamp` and `from_utc_timestamp` respect the timezone in the input timestamp string, which breaks the assumption that the input timestamp is in a specific timezone. Therefore, these 2 functions can return unexpected results. In version 2.4 and later, this problem has been fixed. `to_utc_timestamp` and `from_utc_timestamp` will return null if the input timestamp string contains timezone. As an example, `from_utc_timestamp('2000-10-10 00:00:00', 'GMT+1')` will return `2000-10-10 01:00:00` in both Spark 2.3 and 2.4. However, `from_utc_timestamp('2000-10-10 00:00:00+00:00', 'GMT+1')`, assuming a local timezone of GMT+8, will return `2000-10-10 09:00:00` in Spark 2.3 but `null` in 2.4. For people who don't care about this problem and want to retain the previous behaivor to keep their query unchanged, you can set `spark.sql.function.rejectTimezoneInString` to false. This option will be removed in Spark 3.0 and should only be used as a temporary workaround.
    +  - In version 2.3, if `spark.sql.caseSensitive` is set to true, then the `CURRENT_DATE` and `CURRENT_TIMESTAMP` functions incorrectly became case-sensitive and would resolve to columns (unless typed in lower case). In Spark 2.4 this has been fixed and the functions are no longer case-sensitive.
    --- End diff --
    
    Specifically, only 2.2.1 and 2.3.0 have this issue. 2.2.2 (if it exists) / 2.3.1 / 2.4.0 should be fixed.
    What do you think about that, @gatorsmile?
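
The behavior described in the proposed migration note can be illustrated with a small standalone sketch. This is not Spark's analyzer code: the `resolve` function, the `LITERAL_FUNCTIONS` set, the returned strings, and the column-first lookup order are all hypothetical and assumed only for illustration. It shows column names honoring a case-sensitivity flag while the two parenthesis-free functions are matched case-insensitively regardless of that flag.

```python
from typing import Optional, Set

# Hypothetical registry of functions that may be written without parentheses.
LITERAL_FUNCTIONS: Set[str] = {"current_date", "current_timestamp"}


def resolve(name: str, columns: Set[str], case_sensitive: bool) -> str:
    """Resolve a bare identifier to a column or a literal function.

    Illustrative only: the lookup order and return format are assumptions,
    not Spark's implementation.
    """
    # Column lookup honors the case-sensitivity flag
    # (standing in for spark.sql.caseSensitive).
    match: Optional[str]
    if case_sensitive:
        match = name if name in columns else None
    else:
        match = next((c for c in columns if c.lower() == name.lower()), None)
    if match is not None:
        return f"column:{match}"

    # Literal functions are matched case-insensitively, whatever the flag says.
    if name.lower() in LITERAL_FUNCTIONS:
        return f"function:{name.lower()}"

    raise LookupError(f"cannot resolve {name!r}")
```

With this sketch, `resolve("CURRENT_DATE", {"a"}, case_sensitive=True)` resolves to the function rather than failing as an unknown column, which is the behavior the note says was restored.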


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Ping once more since this can be merged into Spark 3.0, @robert3005 .


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Hi, @robert3005 
    I know you deleted the branch, but could you try this once more?
    At this time, I can review and merge this if this is ready. Also, can we have a test in `AnalysisSuite` instead of `SQLQuerySuite`?
    
    cc @gatorsmile 


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by jamesthomp <gi...@git.apache.org>.
Github user jamesthomp commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    I would be glad for @viirya or @mgaido91 to pick this up from me. It sounds like the remaining work is to move the test from `SQLQuerySuite` into `AnalysisSuite`, but I'm not certain how to easily do that as the form of the tests in `AnalysisSuite` is a bit different.
    Feel free to take my code and push it up onto another branch, thanks!


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Thanks for pinging me @HyukjinKwon . I can take it over too, let me know. Thanks.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    @HyukjinKwon thanks for pinging me. I'd wait for others to take this over first; if no one does, I can do it later.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    **[Test build #90138 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90138/testReport)** for PR 21217 at commit [`662dd2e`](https://github.com/apache/spark/commit/662dd2e1dcd9de2ada87811e116b420f138cfa13).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90081/
    Test FAILed.


---



[GitHub] spark issue #21217: [SPARK-24151][SQL] Fix CURRENT_DATE, CURRENT_TIMESTAMP t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21217
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93181/
    Test PASSed.


---
