You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/03/28 14:18:27 UTC

[GitHub] [spark] gengliangwang opened a new pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

gengliangwang opened a new pull request #31982:
URL: https://github.com/apache/spark/pull/31982


   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   Add a new SQL function `try_cast`. 
   `try_cast` is identical to  `AnsiCast` (or `Cast` when `spark.sql.ansi.enabled` is true), except it returns NULL instead of raising an error. 
   This expression has one major difference from `cast` with `spark.sql.ansi.enabled` as true: when the source value can't be stored in the target integral(Byte/Short/Int/Long) type, `try_cast` returns null instead of returning the low order bytes of the source value.
   Note that the result of `try_cast` is not affected by the configuration `spark.sql.ansi.enabled`.
   
   This is learned from Google BigQuery and Snowflake:
   https://docs.snowflake.com/en/sql-reference/functions/try_cast.html
   https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#safe_casting
   
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   This is an useful for the following scenarios:
   1. When ANSI mode is on, users can choose `try_cast` an alternative way to run SQL without errors for certain operations.
   2. When ANSI mode is off, users can use `try_cast` to get a more reasonable result for casting a value to an integral type: when an overflow error happens, `try_cast` returns null while `cast` returns the low order bytes of the source value.
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   Yes, adding a new function `try_cast`
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   -->
   Unit tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810849942


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41335/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810834893


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41335/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810686283


   Looks fine otherwise.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810253604


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136704/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r604612597



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TryCast.scala
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import org.apache.spark.sql.catalyst.expressions.codegen._
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.types.DataType
+
+/**
+ * A special version of [[AnsiCast]]. It performs the same operation (i.e. converts a value of
+ * one data type into another data type), but returns a NULL value instead of raising an error
+ * when the conversion can not be performed.
+ *
+ * When cast from/to timezone related types, we need timeZoneId, which will be resolved with
+ * session local timezone by an analyzer [[ResolveTimeZone]].
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(expr AS type) - Casts the value `expr` to the target data type `type`. " +
+    "This expression is identical to CAST with configuration `spark.sql.ansi.enabled` as " +
+    "true, except it returns NULL instead of raising an error. Note that the behavior of this " +
+    "expression doesn't depend on configuration `spark.sql.ansi.enabled`.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_('10' as int);
+       10
+      > SELECT _FUNC_(1234567890123L as int);
+       null
+  """,
+  since = "3.2.0",
+  group = "conversion_funcs")

Review comment:
       Actually I plan to create docs for both CAST and TRY_CAST, even with ANSI CAST. Grouping them into one section.
   I have created https://issues.apache.org/jira/browse/SPARK-34917 and https://issues.apache.org/jira/browse/SPARK-34918




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r603932364



##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/TryCastSuite.scala
##########
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import scala.reflect.ClassTag
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.DataType
+
+class TryCastSuite extends AnsiCastSuiteBase {
+  override protected def cast(v: Any, targetType: DataType, timeZoneId: Option[String]) = {
+    v match {
+      case lit: Expression => TryCast(lit, targetType, timeZoneId)
+      case _ => TryCast(Literal(v), targetType, timeZoneId)
+    }
+  }
+
+  override def isAlwaysNullable: Boolean = true
+
+  override protected def setConfigurationHint: String = s"set ${SQLConf.ANSI_ENABLED.key} as false"

Review comment:
       This should not be called, right?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810536525


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136728/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810789707


   **[Test build #136752 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136752/testReport)** for PR 31982 at commit [`9266934`](https://github.com/apache/spark/commit/92669341d5eb849c853104c2a8052ec81fb2a4e5).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-811039881


   @maropu @cloud-fan Thanks for the review. Merging this to master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810840886


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41335/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r603930570



##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala
##########
@@ -885,6 +882,25 @@ abstract class AnsiCastSuiteBase extends CastSuiteBase {
     }
   }
 
+  protected def checkCastToNumericError(l: Literal, to: DataType, tryCastResult: Any): Unit = {

Review comment:
       Do we need the parameter `tryCastResult: Any`? It's always null.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r603757653



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
##########
@@ -4118,6 +4118,26 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
     }
   }
 
+  test("try_cast") {
+    Seq("true", "false").foreach { ansiEnabled =>
+      withSQLConf(SQLConf.ANSI_ENABLED.key -> ansiEnabled) {

Review comment:
       Yes it doesn't depend on ANSI option. We need to make sure that it doesn't throw exception when ansi=true




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r603939820



##########
File path: sql/core/src/test/resources/sql-tests/inputs/try_cast.sql
##########
@@ -0,0 +1,84 @@
+-- TRY_CAST string representing a valid fractional number to integral should truncate the number
+SELECT TRY_CAST('1.23' AS int);
+SELECT TRY_CAST('1.23' AS long);
+SELECT TRY_CAST('-4.56' AS int);
+SELECT TRY_CAST('-4.56' AS long);
+
+-- TRY_CAST string which are not numbers to integral should return null
+SELECT TRY_CAST('abc' AS int);
+SELECT TRY_CAST('abc' AS long);
+
+-- TRY_CAST string representing a very large number to integral should return null
+SELECT TRY_CAST('1234567890123' AS int);
+SELECT TRY_CAST('12345678901234567890123' AS long);
+
+-- TRY_CAST empty string to integral should return null
+SELECT TRY_CAST('' AS int);
+SELECT TRY_CAST('' AS long);
+
+-- TRY_CAST null to integral should return null
+SELECT TRY_CAST(NULL AS int);
+SELECT TRY_CAST(NULL AS long);
+
+-- TRY_CAST invalid decimal string to integral should return null
+SELECT TRY_CAST('123.a' AS int);
+SELECT TRY_CAST('123.a' AS long);
+
+-- '-2147483648' is the smallest int value
+SELECT TRY_CAST('-2147483648' AS int);
+SELECT TRY_CAST('-2147483649' AS int);
+
+-- '2147483647' is the largest int value
+SELECT TRY_CAST('2147483647' AS int);
+SELECT TRY_CAST('2147483648' AS int);
+
+-- '-9223372036854775808' is the smallest long value
+SELECT TRY_CAST('-9223372036854775808' AS long);
+SELECT TRY_CAST('-9223372036854775809' AS long);
+
+-- '9223372036854775807' is the largest long value
+SELECT TRY_CAST('9223372036854775807' AS long);
+SELECT TRY_CAST('9223372036854775808' AS long);
+

Review comment:
       The above overflow tests cover some tests in the beginning
   ```
   -- TRY_CAST string representing a very large number to integral should return null
   SELECT TRY_CAST('1234567890123' AS int);
   SELECT TRY_CAST('12345678901234567890123' AS long);
   ```
   
   shall we remove that test?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810536525


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136728/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r603013688



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TryCast.scala
##########
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import org.apache.spark.sql.catalyst.expressions.codegen._
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.DataType
+
+/**
+ * A special version of [[AnsiCast]]. It performs the same operation (i.e. converts a value of
+ * one data type into another data type), but returns a NULL value instead of raising an error
+ * when the conversion can not be performed.
+ *
+ * When cast from/to timezone related types, we need timeZoneId, which will be resolved with
+ * session local timezone by an analyzer [[ResolveTimeZone]].
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(expr AS type) - Casts the value `expr` to the target data type `type`. " +
+    "This expression is identical to CAST with `spark.sql.ansi.enabled` as true, " +
+    "except it returns NULL instead of raising an error. " +
+    "This expression has one major difference from `cast` with `spark.sql.ansi.enabled` as true: " +
+    "when the source value can't be stored in the target integral(Byte/Short/Int/Long) type, " +
+    "`try_cast` returns null instead of returning the low order bytes of the source value.",

Review comment:
       nit: `try_cast` => `_FUNC_`?

##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TryCast.scala
##########
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import org.apache.spark.sql.catalyst.expressions.codegen._
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.DataType
+
+/**
+ * A special version of [[AnsiCast]]. It performs the same operation (i.e. converts a value of
+ * one data type into another data type), but returns a NULL value instead of raising an error
+ * when the conversion can not be performed.
+ *
+ * When cast from/to timezone related types, we need timeZoneId, which will be resolved with
+ * session local timezone by an analyzer [[ResolveTimeZone]].
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(expr AS type) - Casts the value `expr` to the target data type `type`. " +
+    "This expression is identical to CAST with `spark.sql.ansi.enabled` as true, " +
+    "except it returns NULL instead of raising an error. " +
+    "This expression has one major difference from `cast` with `spark.sql.ansi.enabled` as true: " +

Review comment:
       ditto

##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##########
@@ -1610,6 +1610,17 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with SQLConfHelper with Logg
     cast
   }
 
+  /**
+   * Create a [[TryCast]] expression.
+   */
+  override def visitTryCast(ctx: TryCastContext): Expression = withOrigin(ctx) {

Review comment:
       `visitCast` and `visitTryCast` are similar between each other, so how about merging their definition in `SqlBase.sql`?
   ```
       | cast=(CAST | TRY_CAST) '(' expression AS dataType ')'                                                      #cast
   ```

##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TryCast.scala
##########
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import org.apache.spark.sql.catalyst.expressions.codegen._
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.DataType
+
+/**
+ * A special version of [[AnsiCast]]. It performs the same operation (i.e. converts a value of
+ * one data type into another data type), but returns a NULL value instead of raising an error
+ * when the conversion can not be performed.
+ *
+ * When cast from/to timezone related types, we need timeZoneId, which will be resolved with
+ * session local timezone by an analyzer [[ResolveTimeZone]].
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(expr AS type) - Casts the value `expr` to the target data type `type`. " +
+    "This expression is identical to CAST with `spark.sql.ansi.enabled` as true, " +

Review comment:
       nti: `spark.sql.ansi.enabled` -> `SQLConf.ANSI_ENABLED.key`?

##########
File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
##########
@@ -4118,6 +4118,26 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
     }
   }
 
+  test("try_cast") {

Review comment:
       Could we move this test into `SQLQueryTestSuite`?

##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TryCast.scala
##########
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import org.apache.spark.sql.catalyst.expressions.codegen._
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.DataType
+
+/**
+ * A special version of [[AnsiCast]]. It performs the same operation (i.e. converts a value of
+ * one data type into another data type), but returns a NULL value instead of raising an error
+ * when the conversion can not be performed.
+ *
+ * When cast from/to timezone related types, we need timeZoneId, which will be resolved with
+ * session local timezone by an analyzer [[ResolveTimeZone]].
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(expr AS type) - Casts the value `expr` to the target data type `type`. " +
+    "This expression is identical to CAST with `spark.sql.ansi.enabled` as true, " +

Review comment:
       In the first statement, how about explicitly saying that this behaviour follow the ANSI rule?

##########
File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
##########
@@ -4118,6 +4118,26 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
     }
   }
 
+  test("try_cast") {
+    Seq("true", "false").foreach { ansiEnabled =>
+      withSQLConf(SQLConf.ANSI_ENABLED.key -> ansiEnabled) {

Review comment:
       It looks this expr itself does not depend on the ansi option, but we need tests for ansi=true?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-808914296


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41197/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810968314


   **[Test build #136752 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136752/testReport)** for PR 31982 at commit [`9266934`](https://github.com/apache/spark/commit/92669341d5eb849c853104c2a8052ec81fb2a4e5).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r603757223



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TryCast.scala
##########
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import org.apache.spark.sql.catalyst.expressions.codegen._
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.DataType
+
+/**
+ * A special version of [[AnsiCast]]. It performs the same operation (i.e. converts a value of
+ * one data type into another data type), but returns a NULL value instead of raising an error
+ * when the conversion can not be performed.
+ *
+ * When cast from/to timezone related types, we need timeZoneId, which will be resolved with
+ * session local timezone by an analyzer [[ResolveTimeZone]].
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(expr AS type) - Casts the value `expr` to the target data type `type`. " +
+    "This expression is identical to CAST with `spark.sql.ansi.enabled` as true, " +

Review comment:
       Do you mean the behavior of `castCast`?  I thought that saying 
   ```
   identical to CAST with `spark.sql.ansi.enabled` as true
   ```
   contains that. It is tricky to write the doc here. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r604532860



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TryCast.scala
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import org.apache.spark.sql.catalyst.expressions.codegen._
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.types.DataType
+
+/**
+ * A special version of [[AnsiCast]]. It performs the same operation (i.e. converts a value of
+ * one data type into another data type), but returns a NULL value instead of raising an error
+ * when the conversion can not be performed.
+ *
+ * When cast from/to timezone related types, we need timeZoneId, which will be resolved with
+ * session local timezone by an analyzer [[ResolveTimeZone]].
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(expr AS type) - Casts the value `expr` to the target data type `type`. " +
+    "This expression is identical to CAST with configuration `spark.sql.ansi.enabled` as " +
+    "true, except it returns NULL instead of raising an error. Note that the behavior of this " +
+    "expression doesn't depend on configuration `spark.sql.ansi.enabled`.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_('10' as int);
+       10
+      > SELECT _FUNC_(1234567890123L as int);
+       null
+  """,
+  since = "3.2.0",
+  group = "conversion_funcs")

Review comment:
       Since this expr is not registered in `FunctionRegistry`, this description is not automatically added in the SQL doc via the Python script. So, I think its better to document this new feature in the SQL syntax doc manually.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810372812


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41309/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810969571


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136752/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r604127270



##########
File path: sql/core/src/test/resources/sql-tests/inputs/try_cast.sql
##########
@@ -0,0 +1,84 @@
+-- TRY_CAST string representing a valid fractional number to integral should truncate the number
+SELECT TRY_CAST('1.23' AS int);
+SELECT TRY_CAST('1.23' AS long);
+SELECT TRY_CAST('-4.56' AS int);
+SELECT TRY_CAST('-4.56' AS long);
+
+-- TRY_CAST string which are not numbers to integral should return null
+SELECT TRY_CAST('abc' AS int);
+SELECT TRY_CAST('abc' AS long);
+
+-- TRY_CAST string representing a very large number to integral should return null
+SELECT TRY_CAST('1234567890123' AS int);
+SELECT TRY_CAST('12345678901234567890123' AS long);
+
+-- TRY_CAST empty string to integral should return null
+SELECT TRY_CAST('' AS int);
+SELECT TRY_CAST('' AS long);
+
+-- TRY_CAST null to integral should return null
+SELECT TRY_CAST(NULL AS int);
+SELECT TRY_CAST(NULL AS long);
+
+-- TRY_CAST invalid decimal string to integral should return null
+SELECT TRY_CAST('123.a' AS int);
+SELECT TRY_CAST('123.a' AS long);
+
+-- '-2147483648' is the smallest int value
+SELECT TRY_CAST('-2147483648' AS int);
+SELECT TRY_CAST('-2147483649' AS int);
+
+-- '2147483647' is the largest int value
+SELECT TRY_CAST('2147483647' AS int);
+SELECT TRY_CAST('2147483648' AS int);
+
+-- '-9223372036854775808' is the smallest long value
+SELECT TRY_CAST('-9223372036854775808' AS long);
+SELECT TRY_CAST('-9223372036854775809' AS long);
+
+-- '9223372036854775807' is the largest long value
+SELECT TRY_CAST('9223372036854775807' AS long);
+SELECT TRY_CAST('9223372036854775808' AS long);
+

Review comment:
       Actually, this file is copied from `cast.sql`. This should explain many comments below.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang closed pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
gengliangwang closed pull request #31982:
URL: https://github.com/apache/spark/pull/31982


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810067563


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41285/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810323453


   **[Test build #136728 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136728/testReport)** for PR 31982 at commit [`fa3f2a7`](https://github.com/apache/spark/commit/fa3f2a77732e788c71fa3d695b7eefe2aa0bfad5).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r604621607



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TryCast.scala
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import org.apache.spark.sql.catalyst.expressions.codegen._
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.types.DataType
+
+/**
+ * A special version of [[AnsiCast]]. It performs the same operation (i.e. converts a value of
+ * one data type into another data type), but returns a NULL value instead of raising an error
+ * when the conversion can not be performed.
+ *
+ * When cast from/to timezone related types, we need timeZoneId, which will be resolved with
+ * session local timezone by an analyzer [[ResolveTimeZone]].
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(expr AS type) - Casts the value `expr` to the target data type `type`. " +
+    "This expression is identical to CAST with configuration `spark.sql.ansi.enabled` as " +
+    "true, except it returns NULL instead of raising an error. Note that the behavior of this " +
+    "expression doesn't depend on configuration `spark.sql.ansi.enabled`.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_('10' as int);
+       10
+      > SELECT _FUNC_(1234567890123L as int);
+       null
+  """,
+  since = "3.2.0",
+  group = "conversion_funcs")

Review comment:
       Nice! Thanks for letting me know.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r604129971



##########
File path: sql/core/src/test/resources/sql-tests/inputs/try_cast.sql
##########
@@ -0,0 +1,84 @@
+-- TRY_CAST string representing a valid fractional number to integral should truncate the number
+SELECT TRY_CAST('1.23' AS int);
+SELECT TRY_CAST('1.23' AS long);
+SELECT TRY_CAST('-4.56' AS int);
+SELECT TRY_CAST('-4.56' AS long);
+
+-- TRY_CAST string which are not numbers to integral should return null
+SELECT TRY_CAST('abc' AS int);
+SELECT TRY_CAST('abc' AS long);
+
+-- TRY_CAST string representing a very large number to integral should return null
+SELECT TRY_CAST('1234567890123' AS int);
+SELECT TRY_CAST('12345678901234567890123' AS long);
+
+-- TRY_CAST empty string to integral should return null
+SELECT TRY_CAST('' AS int);
+SELECT TRY_CAST('' AS long);
+
+-- TRY_CAST null to integral should return null
+SELECT TRY_CAST(NULL AS int);
+SELECT TRY_CAST(NULL AS long);
+
+-- TRY_CAST invalid decimal string to integral should return null
+SELECT TRY_CAST('123.a' AS int);
+SELECT TRY_CAST('123.a' AS long);
+
+-- '-2147483648' is the smallest int value
+SELECT TRY_CAST('-2147483648' AS int);
+SELECT TRY_CAST('-2147483649' AS int);
+
+-- '2147483647' is the largest int value
+SELECT TRY_CAST('2147483647' AS int);
+SELECT TRY_CAST('2147483648' AS int);
+
+-- '-9223372036854775808' is the smallest long value
+SELECT TRY_CAST('-9223372036854775808' AS long);
+SELECT TRY_CAST('-9223372036854775809' AS long);
+
+-- '9223372036854775807' is the largest long value
+SELECT TRY_CAST('9223372036854775807' AS long);
+SELECT TRY_CAST('9223372036854775808' AS long);
+

Review comment:
       Let me add more negative cases in this file.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-808912537


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136615/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-808910364


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41197/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810074103


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41285/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810072315


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41285/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-808914296


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41197/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810247988


   **[Test build #136704 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136704/testReport)** for PR 31982 at commit [`fbb25ef`](https://github.com/apache/spark/commit/fbb25ef67be45b9d71b4aed9f7ffe0163d400f5a).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-808905176


   **[Test build #136615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136615/testReport)** for PR 31982 at commit [`d8df8e6`](https://github.com/apache/spark/commit/d8df8e6602f7dcdd78e237a22cff313eb2df44dd).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810074103


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41285/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-808905176


   **[Test build #136615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136615/testReport)** for PR 31982 at commit [`d8df8e6`](https://github.com/apache/spark/commit/d8df8e6602f7dcdd78e237a22cff313eb2df44dd).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r603757476



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
##########
@@ -4118,6 +4118,26 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
     }
   }
 
+  test("try_cast") {

Review comment:
       make sense.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-808912537


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136615/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r604077372



##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/TryCastSuite.scala
##########
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import scala.reflect.ClassTag
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.DataType
+
+class TryCastSuite extends AnsiCastSuiteBase {
+  override protected def cast(v: Any, targetType: DataType, timeZoneId: Option[String]) = {
+    v match {
+      case lit: Expression => TryCast(lit, targetType, timeZoneId)
+      case _ => TryCast(Literal(v), targetType, timeZoneId)
+    }
+  }
+
+  override def isAlwaysNullable: Boolean = true
+
+  override protected def setConfigurationHint: String = s"set ${SQLConf.ANSI_ENABLED.key} as false"

Review comment:
       Yes, it is. I should update the error message of `TryCast`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r603938487



##########
File path: sql/core/src/test/resources/sql-tests/inputs/ansi/try_cast.sql
##########
@@ -0,0 +1 @@
+--IMPORT try_cast.sql

Review comment:
       I feel a bit overkill to run all the end-to-end tests again with ANSI mode. The code is very clear that `TRY_CAST` does not rely on the ANSI flag, and it's good enough to write some UTs to verify it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810362836


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41309/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810253604


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136704/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-808908275


   **[Test build #136615 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136615/testReport)** for PR 31982 at commit [`d8df8e6`](https://github.com/apache/spark/commit/d8df8e6602f7dcdd78e237a22cff313eb2df44dd).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `case class TryCast(child: Expression, dataType: DataType, timeZoneId: Option[String] = None)`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810969571


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136752/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810789707


   **[Test build #136752 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136752/testReport)** for PR 31982 at commit [`9266934`](https://github.com/apache/spark/commit/92669341d5eb849c853104c2a8052ec81fb2a4e5).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810372812


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41309/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810372785


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41309/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810323453


   **[Test build #136728 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136728/testReport)** for PR 31982 at commit [`fa3f2a7`](https://github.com/apache/spark/commit/fa3f2a77732e788c71fa3d695b7eefe2aa0bfad5).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810849942


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41335/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810034405


   **[Test build #136704 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136704/testReport)** for PR 31982 at commit [`fbb25ef`](https://github.com/apache/spark/commit/fbb25ef67be45b9d71b4aed9f7ffe0163d400f5a).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-808914289


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41197/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r604140574



##########
File path: sql/core/src/test/resources/sql-tests/inputs/ansi/try_cast.sql
##########
@@ -0,0 +1 @@
+--IMPORT try_cast.sql

Review comment:
       OK, let's delete it




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r603941072



##########
File path: sql/core/src/test/resources/sql-tests/inputs/try_cast.sql
##########
@@ -0,0 +1,84 @@
+-- TRY_CAST string representing a valid fractional number to integral should truncate the number
+SELECT TRY_CAST('1.23' AS int);
+SELECT TRY_CAST('1.23' AS long);
+SELECT TRY_CAST('-4.56' AS int);
+SELECT TRY_CAST('-4.56' AS long);
+
+-- TRY_CAST string which are not numbers to integral should return null
+SELECT TRY_CAST('abc' AS int);
+SELECT TRY_CAST('abc' AS long);
+
+-- TRY_CAST string representing a very large number to integral should return null
+SELECT TRY_CAST('1234567890123' AS int);
+SELECT TRY_CAST('12345678901234567890123' AS long);
+
+-- TRY_CAST empty string to integral should return null
+SELECT TRY_CAST('' AS int);
+SELECT TRY_CAST('' AS long);
+
+-- TRY_CAST null to integral should return null
+SELECT TRY_CAST(NULL AS int);
+SELECT TRY_CAST(NULL AS long);
+
+-- TRY_CAST invalid decimal string to integral should return null
+SELECT TRY_CAST('123.a' AS int);
+SELECT TRY_CAST('123.a' AS long);
+
+-- '-2147483648' is the smallest int value
+SELECT TRY_CAST('-2147483648' AS int);
+SELECT TRY_CAST('-2147483649' AS int);
+
+-- '2147483647' is the largest int value
+SELECT TRY_CAST('2147483647' AS int);
+SELECT TRY_CAST('2147483648' AS int);
+
+-- '-9223372036854775808' is the smallest long value
+SELECT TRY_CAST('-9223372036854775808' AS long);
+SELECT TRY_CAST('-9223372036854775809' AS long);
+
+-- '9223372036854775807' is the largest long value
+SELECT TRY_CAST('9223372036854775807' AS long);
+SELECT TRY_CAST('9223372036854775808' AS long);
+
+-- TRY_CAST string to its binary representation
+SELECT HEX(TRY_CAST('abc' AS binary));
+
+-- TRY_CAST integral values to their corresponding binary representation
+SELECT HEX(TRY_CAST(TRY_CAST(123 AS byte) AS binary));
+SELECT HEX(TRY_CAST(TRY_CAST(-123 AS byte) AS binary));
+SELECT HEX(TRY_CAST(123S AS binary));
+SELECT HEX(TRY_CAST(-123S AS binary));
+SELECT HEX(TRY_CAST(123 AS binary));
+SELECT HEX(TRY_CAST(-123 AS binary));
+SELECT HEX(TRY_CAST(123L AS binary));
+SELECT HEX(TRY_CAST(-123L AS binary));

Review comment:
       are the HEX tests above related to TRY_CAST?

##########
File path: sql/core/src/test/resources/sql-tests/inputs/try_cast.sql
##########
@@ -0,0 +1,84 @@
+-- TRY_CAST string representing a valid fractional number to integral should truncate the number
+SELECT TRY_CAST('1.23' AS int);
+SELECT TRY_CAST('1.23' AS long);
+SELECT TRY_CAST('-4.56' AS int);
+SELECT TRY_CAST('-4.56' AS long);
+
+-- TRY_CAST string which are not numbers to integral should return null
+SELECT TRY_CAST('abc' AS int);
+SELECT TRY_CAST('abc' AS long);
+
+-- TRY_CAST string representing a very large number to integral should return null
+SELECT TRY_CAST('1234567890123' AS int);
+SELECT TRY_CAST('12345678901234567890123' AS long);
+
+-- TRY_CAST empty string to integral should return null
+SELECT TRY_CAST('' AS int);
+SELECT TRY_CAST('' AS long);
+
+-- TRY_CAST null to integral should return null
+SELECT TRY_CAST(NULL AS int);
+SELECT TRY_CAST(NULL AS long);
+
+-- TRY_CAST invalid decimal string to integral should return null
+SELECT TRY_CAST('123.a' AS int);
+SELECT TRY_CAST('123.a' AS long);
+
+-- '-2147483648' is the smallest int value
+SELECT TRY_CAST('-2147483648' AS int);
+SELECT TRY_CAST('-2147483649' AS int);
+
+-- '2147483647' is the largest int value
+SELECT TRY_CAST('2147483647' AS int);
+SELECT TRY_CAST('2147483648' AS int);
+
+-- '-9223372036854775808' is the smallest long value
+SELECT TRY_CAST('-9223372036854775808' AS long);
+SELECT TRY_CAST('-9223372036854775809' AS long);
+
+-- '9223372036854775807' is the largest long value
+SELECT TRY_CAST('9223372036854775807' AS long);
+SELECT TRY_CAST('9223372036854775808' AS long);
+
+-- TRY_CAST string to its binary representation
+SELECT HEX(TRY_CAST('abc' AS binary));
+
+-- TRY_CAST integral values to their corresponding binary representation
+SELECT HEX(TRY_CAST(TRY_CAST(123 AS byte) AS binary));
+SELECT HEX(TRY_CAST(TRY_CAST(-123 AS byte) AS binary));
+SELECT HEX(TRY_CAST(123S AS binary));
+SELECT HEX(TRY_CAST(-123S AS binary));
+SELECT HEX(TRY_CAST(123 AS binary));
+SELECT HEX(TRY_CAST(-123 AS binary));
+SELECT HEX(TRY_CAST(123L AS binary));
+SELECT HEX(TRY_CAST(-123L AS binary));
+
+-- TRY_CAST string to interval and interval to string
+SELECT TRY_CAST('interval 3 month 1 hour' AS interval);
+SELECT TRY_CAST(interval 3 month 1 hour AS string);

Review comment:
       shall we test some negative cases that return null?

##########
File path: sql/core/src/test/resources/sql-tests/inputs/try_cast.sql
##########
@@ -0,0 +1,84 @@
+-- TRY_CAST string representing a valid fractional number to integral should truncate the number
+SELECT TRY_CAST('1.23' AS int);
+SELECT TRY_CAST('1.23' AS long);
+SELECT TRY_CAST('-4.56' AS int);
+SELECT TRY_CAST('-4.56' AS long);
+
+-- TRY_CAST string which are not numbers to integral should return null
+SELECT TRY_CAST('abc' AS int);
+SELECT TRY_CAST('abc' AS long);
+
+-- TRY_CAST string representing a very large number to integral should return null
+SELECT TRY_CAST('1234567890123' AS int);
+SELECT TRY_CAST('12345678901234567890123' AS long);
+
+-- TRY_CAST empty string to integral should return null
+SELECT TRY_CAST('' AS int);
+SELECT TRY_CAST('' AS long);
+
+-- TRY_CAST null to integral should return null
+SELECT TRY_CAST(NULL AS int);
+SELECT TRY_CAST(NULL AS long);
+
+-- TRY_CAST invalid decimal string to integral should return null
+SELECT TRY_CAST('123.a' AS int);
+SELECT TRY_CAST('123.a' AS long);
+
+-- '-2147483648' is the smallest int value
+SELECT TRY_CAST('-2147483648' AS int);
+SELECT TRY_CAST('-2147483649' AS int);
+
+-- '2147483647' is the largest int value
+SELECT TRY_CAST('2147483647' AS int);
+SELECT TRY_CAST('2147483648' AS int);
+
+-- '-9223372036854775808' is the smallest long value
+SELECT TRY_CAST('-9223372036854775808' AS long);
+SELECT TRY_CAST('-9223372036854775809' AS long);
+
+-- '9223372036854775807' is the largest long value
+SELECT TRY_CAST('9223372036854775807' AS long);
+SELECT TRY_CAST('9223372036854775808' AS long);
+
+-- TRY_CAST string to its binary representation
+SELECT HEX(TRY_CAST('abc' AS binary));
+
+-- TRY_CAST integral values to their corresponding binary representation
+SELECT HEX(TRY_CAST(TRY_CAST(123 AS byte) AS binary));
+SELECT HEX(TRY_CAST(TRY_CAST(-123 AS byte) AS binary));
+SELECT HEX(TRY_CAST(123S AS binary));
+SELECT HEX(TRY_CAST(-123S AS binary));
+SELECT HEX(TRY_CAST(123 AS binary));
+SELECT HEX(TRY_CAST(-123 AS binary));
+SELECT HEX(TRY_CAST(123L AS binary));
+SELECT HEX(TRY_CAST(-123L AS binary));
+
+-- TRY_CAST string to interval and interval to string
+SELECT TRY_CAST('interval 3 month 1 hour' AS interval);
+SELECT TRY_CAST(interval 3 month 1 hour AS string);
+
+-- trim string before TRY_CAST to numeric
+select TRY_CAST(' 1' as tinyint);
+select TRY_CAST(' 1\t' as tinyint);
+select TRY_CAST(' 1' as smallint);
+select TRY_CAST(' 1' as INT);
+select TRY_CAST(' 1' as bigint);
+select TRY_CAST(' 1' as float);
+select TRY_CAST(' 1 ' as DOUBLE);
+select TRY_CAST('1.0 ' as DEC);
+select TRY_CAST('1中文' as tinyint);
+select TRY_CAST('1中文' as smallint);
+select TRY_CAST('1中文' as INT);
+select TRY_CAST('中文1' as bigint);
+select TRY_CAST('1中文' as bigint);
+
+-- trim string before TRY_CAST to boolean
+select TRY_CAST('\t\t true \n\r ' as boolean);
+select TRY_CAST('\t\n false \t\r' as boolean);
+select TRY_CAST('\t\n xyz \t\r' as boolean);

Review comment:
       The trim tests above are not related to TRY_CAST either.

##########
File path: sql/core/src/test/resources/sql-tests/inputs/try_cast.sql
##########
@@ -0,0 +1,84 @@
+-- TRY_CAST string representing a valid fractional number to integral should truncate the number
+SELECT TRY_CAST('1.23' AS int);
+SELECT TRY_CAST('1.23' AS long);
+SELECT TRY_CAST('-4.56' AS int);
+SELECT TRY_CAST('-4.56' AS long);
+
+-- TRY_CAST string which are not numbers to integral should return null
+SELECT TRY_CAST('abc' AS int);
+SELECT TRY_CAST('abc' AS long);
+
+-- TRY_CAST string representing a very large number to integral should return null
+SELECT TRY_CAST('1234567890123' AS int);
+SELECT TRY_CAST('12345678901234567890123' AS long);
+
+-- TRY_CAST empty string to integral should return null
+SELECT TRY_CAST('' AS int);
+SELECT TRY_CAST('' AS long);
+
+-- TRY_CAST null to integral should return null
+SELECT TRY_CAST(NULL AS int);
+SELECT TRY_CAST(NULL AS long);
+
+-- TRY_CAST invalid decimal string to integral should return null
+SELECT TRY_CAST('123.a' AS int);
+SELECT TRY_CAST('123.a' AS long);
+
+-- '-2147483648' is the smallest int value
+SELECT TRY_CAST('-2147483648' AS int);
+SELECT TRY_CAST('-2147483649' AS int);
+
+-- '2147483647' is the largest int value
+SELECT TRY_CAST('2147483647' AS int);
+SELECT TRY_CAST('2147483648' AS int);
+
+-- '-9223372036854775808' is the smallest long value
+SELECT TRY_CAST('-9223372036854775808' AS long);
+SELECT TRY_CAST('-9223372036854775809' AS long);
+
+-- '9223372036854775807' is the largest long value
+SELECT TRY_CAST('9223372036854775807' AS long);
+SELECT TRY_CAST('9223372036854775808' AS long);
+
+-- TRY_CAST string to its binary representation
+SELECT HEX(TRY_CAST('abc' AS binary));
+
+-- TRY_CAST integral values to their corresponding binary representation
+SELECT HEX(TRY_CAST(TRY_CAST(123 AS byte) AS binary));
+SELECT HEX(TRY_CAST(TRY_CAST(-123 AS byte) AS binary));
+SELECT HEX(TRY_CAST(123S AS binary));
+SELECT HEX(TRY_CAST(-123S AS binary));
+SELECT HEX(TRY_CAST(123 AS binary));
+SELECT HEX(TRY_CAST(-123 AS binary));
+SELECT HEX(TRY_CAST(123L AS binary));
+SELECT HEX(TRY_CAST(-123L AS binary));
+
+-- TRY_CAST string to interval and interval to string
+SELECT TRY_CAST('interval 3 month 1 hour' AS interval);
+SELECT TRY_CAST(interval 3 month 1 hour AS string);
+
+-- trim string before TRY_CAST to numeric
+select TRY_CAST(' 1' as tinyint);
+select TRY_CAST(' 1\t' as tinyint);
+select TRY_CAST(' 1' as smallint);
+select TRY_CAST(' 1' as INT);
+select TRY_CAST(' 1' as bigint);
+select TRY_CAST(' 1' as float);
+select TRY_CAST(' 1 ' as DOUBLE);
+select TRY_CAST('1.0 ' as DEC);
+select TRY_CAST('1中文' as tinyint);
+select TRY_CAST('1中文' as smallint);
+select TRY_CAST('1中文' as INT);
+select TRY_CAST('中文1' as bigint);
+select TRY_CAST('1中文' as bigint);
+
+-- trim string before TRY_CAST to boolean
+select TRY_CAST('\t\t true \n\r ' as boolean);
+select TRY_CAST('\t\n false \t\r' as boolean);
+select TRY_CAST('\t\n xyz \t\r' as boolean);
+
+SELECT TRY_CAST("2021-01-01" AS date);
+SELECT TRY_CAST("2021-101-01" AS date);
+
+SELECT TRY_CAST("2021-01-01 00:00:00" AS timestamp);
+SELECT TRY_CAST("2021-101-01 00:00:00" AS timestamp);

Review comment:
       can we also add some negative tests for casting to boolean?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810034405


   **[Test build #136704 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136704/testReport)** for PR 31982 at commit [`fbb25ef`](https://github.com/apache/spark/commit/fbb25ef67be45b9d71b4aed9f7ffe0163d400f5a).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810535337


   **[Test build #136728 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136728/testReport)** for PR 31982 at commit [`fa3f2a7`](https://github.com/apache/spark/commit/fa3f2a77732e788c71fa3d695b7eefe2aa0bfad5).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org