You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "wangyum (via GitHub)" <gi...@apache.org> on 2024/03/18 13:26:21 UTC

[PR] [SPARK-47441][YARN] Do not add log link for unmanaged AM in Spark UI [spark]

wangyum opened a new pull request, #45565:
URL: https://github.com/apache/spark/pull/45565

   ### What changes were proposed in this pull request?
   
   This PR makes it do not add log link for unmanaged AM in Spark UI.
   
   ### Why are the changes needed?
   
   Avoid start driver error messages:
   ```
   24/03/18 04:58:25,022 ERROR [spark-listener-group-appStatus] scheduler.AsyncEventQueue:97 : Listener AppStatusListener threw an exception
   java.lang.NumberFormatException: For input string: "null"
   	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:67) ~[?:?]
   	at java.lang.Integer.parseInt(Integer.java:668) ~[?:?]
   	at java.lang.Integer.parseInt(Integer.java:786) ~[?:?]
   	at scala.collection.immutable.StringLike.toInt(StringLike.scala:310) ~[scala-library-2.12.18.jar:?]
   	at scala.collection.immutable.StringLike.toInt$(StringLike.scala:310) ~[scala-library-2.12.18.jar:?]
   	at scala.collection.immutable.StringOps.toInt(StringOps.scala:33) ~[scala-library-2.12.18.jar:?]
   	at org.apache.spark.util.Utils$.parseHostPort(Utils.scala:1105) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.status.ProcessSummaryWrapper.<init>(storeTypes.scala:609) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.status.LiveMiscellaneousProcess.doUpdate(LiveEntity.scala:1045) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.status.LiveEntity.write(LiveEntity.scala:50) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.status.AppStatusListener.update(AppStatusListener.scala:1233) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.status.AppStatusListener.onMiscellaneousProcessAdded(AppStatusListener.scala:1445) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.status.AppStatusListener.onOtherEvent(AppStatusListener.scala:113) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:100) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:117) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:101) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.scheduler.AsyncEventQueue.super$postToAll(AsyncEventQueue.scala:105) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.scheduler.AsyncEventQueue.$anonfun$dispatch$1(AsyncEventQueue.scala:105) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23) ~[scala-library-2.12.18.jar:?]
   	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) ~[scala-library-2.12.18.jar:?]
   	at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:100) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.$anonfun$run$1(AsyncEventQueue.scala:96) ~[spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1356) [spark-core_2.12-3.5.1.jar:3.5.1]
   	at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.run(AsyncEventQueue.scala:96) [spark-core_2.12-3.5.1.jar:3.5.1]
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Manual testing:
   ```shell
   bin/spark-sql --master yarn  --conf spark.yarn.unmanagedAM.enabled=true 
   ```
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47441][YARN] Do not add log link for unmanaged AM in Spark UI [spark]

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum commented on code in PR #45565:
URL: https://github.com/apache/spark/pull/45565#discussion_r1528619951


##########
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:
##########
@@ -788,9 +788,9 @@ private[spark] class ApplicationMaster(
 
     override def onStart(): Unit = {
       driver.send(RegisterClusterManager(self))
-      // if deployment mode for yarn Application is client
+      // if deployment mode for yarn Application is managed client
       // then send the AM Log Info to spark driver
-      if (!isClusterMode) {
+      if (!isClusterMode && !sparkConf.get(YARN_UNMANAGED_AM)) {

Review Comment:
   Only the driver process not show:
   <img width="1621" alt="image" src="https://github.com/apache/spark/assets/5399861/337a363d-e1b9-4ca3-b25a-b1f57c2cfec9">
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47441][YARN] Do not add log link for unmanaged AM in Spark UI [spark]

Posted by "tgravescs (via GitHub)" <gi...@apache.org>.
tgravescs commented on code in PR #45565:
URL: https://github.com/apache/spark/pull/45565#discussion_r1528603979


##########
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:
##########
@@ -788,9 +788,9 @@ private[spark] class ApplicationMaster(
 
     override def onStart(): Unit = {
       driver.send(RegisterClusterManager(self))
-      // if deployment mode for yarn Application is client
+      // if deployment mode for yarn Application is managed client
       // then send the AM Log Info to spark driver
-      if (!isClusterMode) {
+      if (!isClusterMode && !sparkConf.get(YARN_UNMANAGED_AM)) {

Review Comment:
   This causes the entire process to not show up in the executor page then?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47441][YARN] Do not add log link for unmanaged AM in Spark UI [spark]

Posted by "tgravescs (via GitHub)" <gi...@apache.org>.
tgravescs commented on code in PR #45565:
URL: https://github.com/apache/spark/pull/45565#discussion_r1528642004


##########
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:
##########
@@ -788,9 +788,9 @@ private[spark] class ApplicationMaster(
 
     override def onStart(): Unit = {
       driver.send(RegisterClusterManager(self))
-      // if deployment mode for yarn Application is client
+      // if deployment mode for yarn Application is managed client
       // then send the AM Log Info to spark driver
-      if (!isClusterMode) {
+      if (!isClusterMode && !sparkConf.get(YARN_UNMANAGED_AM)) {

Review Comment:
   sorry, I don't know what you mean by that? is this the after picture and if so what was the before picture?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47441][YARN] Do not add log link for unmanaged AM in Spark UI [spark]

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum commented on code in PR #45565:
URL: https://github.com/apache/spark/pull/45565#discussion_r1528652678


##########
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:
##########
@@ -788,9 +788,9 @@ private[spark] class ApplicationMaster(
 
     override def onStart(): Unit = {
       driver.send(RegisterClusterManager(self))
-      // if deployment mode for yarn Application is client
+      // if deployment mode for yarn Application is managed client
       // then send the AM Log Info to spark driver
-      if (!isClusterMode) {
+      if (!isClusterMode && !sparkConf.get(YARN_UNMANAGED_AM)) {

Review Comment:
   There is no difference on the executor page, but there will be an error when starting the driver.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47441][YARN] Do not add log link for unmanaged AM in Spark UI [spark]

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum commented on PR #45565:
URL: https://github.com/apache/spark/pull/45565#issuecomment-2003905168

   cc @SaurabhChawla100 @tgravescs


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47441][YARN] Do not add log link for unmanaged AM in Spark UI [spark]

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum commented on code in PR #45565:
URL: https://github.com/apache/spark/pull/45565#discussion_r1528652678


##########
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:
##########
@@ -788,9 +788,9 @@ private[spark] class ApplicationMaster(
 
     override def onStart(): Unit = {
       driver.send(RegisterClusterManager(self))
-      // if deployment mode for yarn Application is client
+      // if deployment mode for yarn Application is managed client
       // then send the AM Log Info to spark driver
-      if (!isClusterMode) {
+      if (!isClusterMode && !sparkConf.get(YARN_UNMANAGED_AM)) {

Review Comment:
   There is no difference  on the executor page, but there will be an error when starting the driver.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org