Posted to reviews@spark.apache.org by "nija-at (via GitHub)" <gi...@apache.org> on 2023/11/23 16:57:56 UTC

[PR] [SPARK-46074][CONNECT][SCALA] Insufficient details in error message on UDF failure [spark]

nija-at opened a new pull request, #43983:
URL: https://github.com/apache/spark/pull/43983

   ### What changes were proposed in this pull request?
   
   Update the error message for `FAILED_EXECUTE_UDF` to include the underlying error
   message.
   
   ### Why are the changes needed?
   
   The Spark Connect client does not receive the underlying cause of a UDF failure.
   This means that a user has to go into the driver logs to identify the cause of the
   failure.
   
   Update the error message so that the underlying exception's message is included.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. This changes the error message that the user sees when a UDF fails. A new error
   parameter is added, but the SQL state and the existing parameters are unchanged, so
   this should cause no regressions.
   
   The error message prior to this change:
   
   ```
   org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 1 times, most recent failure: Lost task 3.0 in stage 0.0 (TID 3) (192.168.188.21 executor driver): org.apache.spark.SparkException: [FAILED_EXECUTE_UDF] Failed to execute user defined function (` (cmd2$Helper$$Lambda$2170/0x000000f001d23000)`: (int) => int). SQLSTATE: 39000
   ```
   
   Sample of the new error message:
   
   ```
   org.apache.spark.SparkException: [FAILED_EXECUTE_UDF] User defined function (` (cmd2$Helper$$Lambda$2422/0x0000007001ec1a10)`: (int) => int) failed due to: java.lang.NoClassDefFoundError: com/nija/test/MyClass. SQLSTATE: 39000
   ```
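
   To make the shape of the new message concrete, here is a minimal sketch in plain
   Scala with no Spark dependency. It is not the code in this patch: the object name,
   the template string, and the placeholder names (`functionName`, `signature`,
   `reason`) are assumptions chosen to reproduce the sample above.

   ```scala
   // Illustrative only: plain Scala, not Spark's actual error framework.
   object UdfErrorSketch {
     // Hypothetical template shaped like the sample message above.
     private val template =
       "[FAILED_EXECUTE_UDF] User defined function (<functionName>: <signature>) " +
         "failed due to: <reason>. SQLSTATE: 39000"

     // Replace each <name> placeholder with the corresponding value.
     def render(params: Map[String, String]): String =
       params.foldLeft(template) { case (msg, (k, v)) => msg.replace(s"<$k>", v) }

     // Build the message from the UDF description and the underlying cause.
     // Throwable.toString yields "<class name>: <message>", which is the
     // "failed due to" part of the sample message.
     def failedExecuteUdf(
         functionName: String,
         signature: String,
         cause: Throwable): String =
       render(Map(
         "functionName" -> s"`$functionName`",
         "signature" -> signature,
         "reason" -> cause.toString))
   }
   ```

   Calling `UdfErrorSketch.failedExecuteUdf` with a `NoClassDefFoundError` for
   `com/nija/test/MyClass` produces a message of the same form as the sample, with the
   cause visible to the client instead of only in the driver logs.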
   
   ### How was this patch tested?
   
   Tested manually by running a [local connect server] and a [connect client REPL].
   
   [local connect server]: https://github.com/apache/spark/blob/master/connector/connect/bin/spark-connect-shell
   [connect client REPL]: https://github.com/apache/spark/blob/master/connector/connect/bin/spark-connect-scala-client
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46074][CONNECT][SCALA] Insufficient details in error message on UDF failure [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #43983:
URL: https://github.com/apache/spark/pull/43983#issuecomment-1824992475

   FYI: https://github.com/nija-at/spark/actions/runs/6969839081/job/18966590594#step:10:6723
   ```
   [info] - better error message for NPE *** FAILED *** (3 milliseconds)
   [info]   "[FAILED_EXECUTE_UDF] User defined function (`ScalaUDFSuite$$Lambda$10456/0x00007f715148ace0`: (string) => string) failed due to: java.lang.NullPointerException: Cannot invoke "String.toLowerCase(java.util.Locale)" because "s" is null. SQLSTATE: 39000" did not contain "Failed to execute user defined function" (ScalaUDFSuite.scala:54)
   [info]   org.scalatest.exceptions.TestFailedException:
   [info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
   [info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
   [info]   at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231)
   [info]   at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295)
   [info]   at org.apache.spark.sql.catalyst.expressions.ScalaUDFSuite.$anonfun$new$6(ScalaUDFSuite.scala:54)
   [info]   at org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
   [info]   at org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
   [info]   at org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
   [info]   at org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
   [info]   at org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
   [info]   at org.apache.spark.SparkFunSuite.$anonfun$test$2(SparkFunSuite.scala:155)
   [info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
   [info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
   [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
   [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
   ```
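
   The failure above is a substring check that still looks for the old wording. A
   minimal sketch of the mismatch in plain Scala follows; the object and value names
   are hypothetical, the function name and exception text are shortened, and this is
   not the actual fix applied to `ScalaUDFSuite`.

   ```scala
   // Illustrative only: shows why a check on the old wording fails.
   object AssertionSketch {
     // New-style message, shortened from the log above.
     val newMessage: String =
       "[FAILED_EXECUTE_UDF] User defined function (`f`: (string) => string) " +
         "failed due to: java.lang.NullPointerException. SQLSTATE: 39000"

     // Fragment the existing test expects (old wording).
     val oldFragment = "Failed to execute user defined function"

     // Fragment that a check against the new wording could look for.
     val newFragment = "failed due to: java.lang.NullPointerException"

     def matchesOld: Boolean = newMessage.contains(oldFragment)
     def matchesNew: Boolean = newMessage.contains(newFragment)
   }
   ```

   Here `matchesOld` is false and `matchesNew` is true, so tests that assert on the old
   text need to be updated to the new wording or to the error class.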
   
   cc @MaxGekk and @srielau 




Re: [PR] [SPARK-46074][CONNECT][SCALA] Insufficient details in error message on UDF failure [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #43983:
URL: https://github.com/apache/spark/pull/43983#issuecomment-1830996572

   Thanks for letting me know, and for the fix, @LuciferYang!




Re: [PR] [SPARK-46074][CONNECT][SCALA] Insufficient details in error message on UDF failure [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #43983: [SPARK-46074][CONNECT][SCALA] Insufficient details in error message on UDF failure
URL: https://github.com/apache/spark/pull/43983




Re: [PR] [SPARK-46074][CONNECT][SCALA] Insufficient details in error message on UDF failure [spark]

Posted by "nija-at (via GitHub)" <gi...@apache.org>.
nija-at commented on PR #43983:
URL: https://github.com/apache/spark/pull/43983#issuecomment-1826877511

   @HyukjinKwon - I've fixed all the failing tests. Can you take another look and merge if happy?




Re: [PR] [SPARK-46074][CONNECT][SCALA] Insufficient details in error message on UDF failure [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #43983:
URL: https://github.com/apache/spark/pull/43983#issuecomment-1826942313

   Merged to master

