Posted to issues@spark.apache.org by "Max Gekk (Jira)" <ji...@apache.org> on 2023/11/01 07:40:00 UTC
[jira] [Assigned] (SPARK-45022) Provide context for dataset API errors
[ https://issues.apache.org/jira/browse/SPARK-45022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Gekk reassigned SPARK-45022:
--------------------------------
Assignee: Max Gekk
> Provide context for dataset API errors
> --------------------------------------
>
> Key: SPARK-45022
> URL: https://issues.apache.org/jira/browse/SPARK-45022
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Peter Toth
> Assignee: Max Gekk
> Priority: Major
> Labels: pull-request-available
>
> SQL failures already provide nice error context when there is a failure:
> {noformat}
> org.apache.spark.SparkArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
> == SQL(line 1, position 1) ==
> a / b
> ^^^^^
> at org.apache.spark.sql.errors.QueryExecutionErrors$.divideByZeroError(QueryExecutionErrors.scala:201)
> at org.apache.spark.sql.errors.QueryExecutionErrors.divideByZeroError(QueryExecutionErrors.scala)
> ...
> {noformat}
> We could add a similar user-friendly error context to Dataset APIs.
> E.g., consider the following Spark app, SimpleApp.scala:
> {noformat}
> 1  import org.apache.spark.sql.SparkSession
> 2  import org.apache.spark.sql.functions._
> 3
> 4  object SimpleApp {
> 5    def main(args: Array[String]): Unit = {
> 6      val spark = SparkSession.builder.appName("Simple Application").config("spark.sql.ansi.enabled", true).getOrCreate()
> 7      import spark.implicits._
> 8
> 9      val c = col("a") / col("b")
> 10
> 11     Seq((1, 0)).toDF("a", "b").select(c).show()
> 12
> 13     spark.stop()
> 14   }
> 15 }
> {noformat}
> then the error context could be:
> {noformat}
> Exception in thread "main" org.apache.spark.SparkArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
> == Dataset ==
> "div" was called from SimpleApp$.main(SimpleApp.scala:9)
> at org.apache.spark.sql.errors.QueryExecutionErrors$.divideByZeroError(QueryExecutionErrors.scala:201)
> at org.apache.spark.sql.catalyst.expressions.DivModLike.eval(arithmetic.scala:672)
> ...
> {noformat}
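For illustration, one way such a call site could be captured is to scan the current thread's stack trace for the first frame outside the library's own packages, and record it when the column expression is built. This is a hypothetical, standalone sketch of that idea, not the actual implementation proposed in the pull request; the object and prefix names are assumptions:

```scala
// Hypothetical sketch: report the first stack frame outside "internal"
// packages as the origin of a Dataset/column operation.
object ErrorContextSketch {
  // Frames from these package/class prefixes are treated as internal
  // machinery and skipped when looking for the user's call site.
  private val internalPrefixes =
    Seq("ErrorContextSketch", "java.", "jdk.", "sun.", "scala.")

  def callSite(): String =
    Thread.currentThread().getStackTrace
      .find(f => !internalPrefixes.exists(p => f.getClassName.startsWith(p)))
      .map(f => s"${f.getClassName}.${f.getMethodName}(${f.getFileName}:${f.getLineNumber})")
      .getOrElse("<unknown>")
}

object UserApp {
  def main(args: Array[String]): Unit = {
    // Prints something like: "div" was called from UserApp$.main(Sketch.scala:19)
    println(s""""div" was called from ${ErrorContextSketch.callSite()}""")
  }
}
```

A real implementation would capture this at expression-construction time (e.g. when `/` is applied to a `Column`) and attach it to the expression, so it can be replayed later when evaluation fails during execution.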
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org