You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "ueshin (via GitHub)" <gi...@apache.org> on 2023/05/11 23:11:29 UTC

[GitHub] [spark] ueshin opened a new pull request, #41148: [SPARK-42945][CONNECT][FOLLOWUP] Disable JVM stack trace by default

ueshin opened a new pull request, #41148:
URL: https://github.com/apache/spark/pull/41148

   ### What changes were proposed in this pull request?
   
   This is a follow-up of #40575.
   
   Disables JVM stack trace by default.
   
   ```py
   % ./bin/pyspark --remote local
   ...
   >>> spark.conf.set("spark.sql.ansi.enabled", True)
   >>> spark.sql('select 1/0').show()
   ...
   Traceback (most recent call last):
   ...
   pyspark.errors.exceptions.connect.ArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
   == SQL(line 1, position 8) ==
   select 1/0
          ^^^
   
   >>> 
   >>> spark.conf.set("spark.sql.pyspark.jvmStacktrace.enabled", True)
   >>> spark.sql('select 1/0').show()
   ...
   Traceback (most recent call last):
   ...
   pyspark.errors.exceptions.connect.ArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
   == SQL(line 1, position 8) ==
   select 1/0
          ^^^
   
   
   JVM stacktrace:
   org.apache.spark.SparkArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
   == SQL(line 1, position 8) ==
   select 1/0
          ^^^
   
   	at org.apache.spark.sql.errors.QueryExecutionErrors$.divideByZeroError(QueryExecutionErrors.scala:226)
   	at org.apache.spark.sql.catalyst.expressions.DivModLike.eval(arithmetic.scala:674)
   ...
   ```
   
   ### Why are the changes needed?
   
   Currently JVM stack trace is enabled by default.
   
   ```py
   % ./bin/pyspark --remote local
   ...
   >>> spark.conf.set("spark.sql.ansi.enabled", True)
   >>> spark.sql('select 1/0').show()
   ...
   Traceback (most recent call last):
   ...
   pyspark.errors.exceptions.connect.ArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
   == SQL(line 1, position 8) ==
   select 1/0
          ^^^
   
   JVM stacktrace:
   org.apache.spark.SparkArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
   == SQL(line 1, position 8) ==
   select 1/0
          ^^^
   
   	at org.apache.spark.sql.errors.QueryExecutionErrors$.divideByZeroError(QueryExecutionErrors.scala:226)
   	at org.apache.spark.sql.catalyst.expressions.DivModLike.eval(arithmetic.scala:674)
   ...
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   
   Users won't see the JVM stack trace by default.
   
   ### How was this patch tested?
   
   Existing tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41148: [SPARK-42945][CONNECT][FOLLOWUP] Disable JVM stack trace by default

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #41148:
URL: https://github.com/apache/spark/pull/41148#discussion_r1191800825


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##########
@@ -125,7 +125,7 @@ class SparkConnectService(debug: Boolean)
       SparkConnectService
         .getOrCreateIsolatedSession(userId, sessionId)
         .session
-    val stackTraceEnabled = session.conf.get(PYSPARK_JVM_STACKTRACE_ENABLED.key, "true").toBoolean
+    val stackTraceEnabled = session.conf.get(PYSPARK_JVM_STACKTRACE_ENABLED)

Review Comment:
   I'm wondering if this PR aims to match with `PySpark Shell` environment or not.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41148: [SPARK-42945][CONNECT][FOLLOWUP] Disable JVM stack trace by default

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #41148:
URL: https://github.com/apache/spark/pull/41148#discussion_r1191800397


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##########
@@ -125,7 +125,7 @@ class SparkConnectService(debug: Boolean)
       SparkConnectService
         .getOrCreateIsolatedSession(userId, sessionId)
         .session
-    val stackTraceEnabled = session.conf.get(PYSPARK_JVM_STACKTRACE_ENABLED.key, "true").toBoolean
+    val stackTraceEnabled = session.conf.get(PYSPARK_JVM_STACKTRACE_ENABLED)

Review Comment:
   `Spark Shell` also shows JVM StackTrace by default, doesn't it, @ueshin ?
   > Users won't see the JVM stack trace by default.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #41148: [SPARK-42945][CONNECT][FOLLOWUP] Disable JVM stack trace by default

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #41148:
URL: https://github.com/apache/spark/pull/41148#issuecomment-1545089488

   Merged to master. Thank you, @ueshin and @allisonwang-db .


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] ueshin commented on a diff in pull request #41148: [SPARK-42945][CONNECT][FOLLOWUP] Disable JVM stack trace by default

Posted by "ueshin (via GitHub)" <gi...@apache.org>.
ueshin commented on code in PR #41148:
URL: https://github.com/apache/spark/pull/41148#discussion_r1191805164


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##########
@@ -125,7 +125,7 @@ class SparkConnectService(debug: Boolean)
       SparkConnectService
         .getOrCreateIsolatedSession(userId, sessionId)
         .session
-    val stackTraceEnabled = session.conf.get(PYSPARK_JVM_STACKTRACE_ENABLED.key, "true").toBoolean
+    val stackTraceEnabled = session.conf.get(PYSPARK_JVM_STACKTRACE_ENABLED)

Review Comment:
   That's a good point.
   
   The original PR is for Spark Connect Python client and it was supposed to be disabled by default if it's not testing mode:
   https://github.com/apache/spark/blob/939c5f507b1d3225d9575ae39ab494df092c6c9e/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L2780-L2789
   
   For Scala client, I think it's TBD? https://github.com/apache/spark/pull/40575#issuecomment-1512439357 cc @allisonwang-db 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41148: [SPARK-42945][CONNECT][FOLLOWUP] Disable JVM stack trace by default

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #41148:
URL: https://github.com/apache/spark/pull/41148#discussion_r1191808572


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##########
@@ -125,7 +125,7 @@ class SparkConnectService(debug: Boolean)
       SparkConnectService
         .getOrCreateIsolatedSession(userId, sessionId)
         .session
-    val stackTraceEnabled = session.conf.get(PYSPARK_JVM_STACKTRACE_ENABLED.key, "true").toBoolean
+    val stackTraceEnabled = session.conf.get(PYSPARK_JVM_STACKTRACE_ENABLED)

Review Comment:
   Thank you!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun closed pull request #41148: [SPARK-42945][CONNECT][FOLLOWUP] Disable JVM stack trace by default

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun closed pull request #41148: [SPARK-42945][CONNECT][FOLLOWUP] Disable JVM stack trace by default
URL: https://github.com/apache/spark/pull/41148


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] allisonwang-db commented on a diff in pull request #41148: [SPARK-42945][CONNECT][FOLLOWUP] Disable JVM stack trace by default

Posted by "allisonwang-db (via GitHub)" <gi...@apache.org>.
allisonwang-db commented on code in PR #41148:
URL: https://github.com/apache/spark/pull/41148#discussion_r1191877562


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##########
@@ -125,7 +125,7 @@ class SparkConnectService(debug: Boolean)
       SparkConnectService
         .getOrCreateIsolatedSession(userId, sessionId)
         .session
-    val stackTraceEnabled = session.conf.get(PYSPARK_JVM_STACKTRACE_ENABLED.key, "true").toBoolean
+    val stackTraceEnabled = session.conf.get(PYSPARK_JVM_STACKTRACE_ENABLED)

Review Comment:
   Yea this `PYSPARK_JVM_STACKTRACE_ENABLED` flag is Python-specific. We don't have a corresponding flag for the Scala client.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] allisonwang-db commented on a diff in pull request #41148: [SPARK-42945][CONNECT][FOLLOWUP] Disable JVM stack trace by default

Posted by "allisonwang-db (via GitHub)" <gi...@apache.org>.
allisonwang-db commented on code in PR #41148:
URL: https://github.com/apache/spark/pull/41148#discussion_r1191877562


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##########
@@ -125,7 +125,7 @@ class SparkConnectService(debug: Boolean)
       SparkConnectService
         .getOrCreateIsolatedSession(userId, sessionId)
         .session
-    val stackTraceEnabled = session.conf.get(PYSPARK_JVM_STACKTRACE_ENABLED.key, "true").toBoolean
+    val stackTraceEnabled = session.conf.get(PYSPARK_JVM_STACKTRACE_ENABLED)

Review Comment:
   Yea this `PYSPARK_JVM_STACKTRACE_ENABLED` flag is Python-specific. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org