Posted to issues@spark.apache.org by "Erik Krogen (Jira)" <ji...@apache.org> on 2020/10/19 23:39:00 UTC

[jira] [Updated] (SPARK-33185) YARN: Print direct links to driver logs alongside application report in cluster mode

     [ https://issues.apache.org/jira/browse/SPARK-33185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen updated SPARK-33185:
--------------------------------
    Description: 
Currently, when run in {{cluster}} mode on YARN, the Spark {{yarn.Client}} prints the application report to its logs so that users can view it easily. For example:
{code}
INFO yarn.Client: 
 	 client token: Token { kind: YARN_CLIENT_TOKEN, service:  }
 	 diagnostics: N/A
 	 ApplicationMaster host: X.X.X.X
 	 ApplicationMaster RPC port: 0
 	 queue: default
 	 start time: 1602782566027
 	 final status: UNDEFINED
 	 tracking URL: http://hostname:8888/proxy/application_<id>/
 	 user: xkrogen
{code}
Typically, the tracking URL can be used to find the ApplicationMaster/driver logs while the application is running. Afterwards, the Spark History Server (SHS) can be used to find them, via the stdout/stderr links on the Executors page.

However, if the driver crashes _before_ writing out a history file, the SHS may not be aware of the application, and thus has no links to the driver logs. When this happens, it can be difficult for users to debug further, since they can't easily find their driver logs.

The logs can still be reached with the {{yarn logs}} command, but the average Spark user isn't aware of this and shouldn't have to be.
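
For reference, the manual route looks roughly like the following. This is a minimal Python sketch that only assembles the {{yarn logs}} command line; the helper name {{yarn_logs_command}} and the sample application ID are hypothetical, and it assumes the {{-applicationId}} and {{-containerId}} options of the {{yarn logs}} CLI.

```python
import shlex


def yarn_logs_command(application_id, container_id=None):
    """Build the argv for `yarn logs` (hypothetical helper, for illustration only)."""
    argv = ["yarn", "logs", "-applicationId", application_id]
    if container_id is not None:
        # Optionally narrow the output to a single container, e.g. the driver's.
        argv += ["-containerId", container_id]
    return argv


# The application ID below is a made-up sample, not taken from a real cluster.
print(shlex.join(yarn_logs_command("application_1602782566027_0001")))
```

The user still has to know the application ID (and ideally the driver's container ID) up front, which is the friction this issue is about.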

I propose adding, alongside the application report, some additional lines like:
{code}
         Driver Logs (stdout): http://hostname:8042/node/containerlogs/container_<id>/xkrogen/stdout?start=-4096
         Driver Logs (stderr): http://hostname:8042/node/containerlogs/container_<id>/xkrogen/stderr?start=-4096
{code}
With this information available, users can quickly jump to their driver logs, even if the driver crashed before the SHS became aware of the application. It also gives single-click access to the driver logs, which often contain useful information, without navigating through the Spark UI.
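
To make the proposed format concrete, here is a minimal Python sketch of how the two links could be assembled. It is not the Spark implementation: the function names and parameters are hypothetical, and it assumes the NodeManager HTTP address, the driver's container ID, and the user name are already known. Only the URL layout and the {{start=-4096}} tail offset are taken from the example lines above.

```python
def driver_log_urls(nm_http_address, container_id, user, tail_bytes=4096):
    """Return stdout/stderr log URLs in the layout shown in the proposal.

    All parameters are assumed inputs; where they would come from inside
    Spark is left open here.
    """
    base = f"http://{nm_http_address}/node/containerlogs/{container_id}/{user}"
    # A negative `start` asks for the last `tail_bytes` bytes of the log.
    return {name: f"{base}/{name}?start=-{tail_bytes}" for name in ("stdout", "stderr")}


def format_report_lines(urls):
    """Format the URLs as extra lines in the style of the application report."""
    return [f"\t Driver Logs ({name}): {url}" for name, url in urls.items()]


if __name__ == "__main__":
    urls = driver_log_urls("hostname:8042", "container_<id>", "xkrogen")
    print("\n".join(format_report_lines(urls)))
```

Keeping the URL construction separate from the formatting would make it easy to log the links next to the existing report fields without changing the report itself.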

  was:
Currently when run in {{cluster}} mode on YARN, the Spark {{yarn.Client}} will print out the application report into the logs, to be easily viewed by users. For example:
{code}
INFO yarn.Client: 
 	 client token: Token { kind: YARN_CLIENT_TOKEN, service:  }
 	 diagnostics: N/A
 	 ApplicationMaster host: X.X.X.X
 	 ApplicationMaster RPC port: 0
 	 queue: default
 	 start time: 1602782566027
 	 final status: UNDEFINED
 	 tracking URL: http://hostname:8888/proxy/application_<id>/
 	 user: xkrogen
{code}
Typically, the tracking URL can be used to find the logs of the ApplicationMaster/driver while the application is running. Later, the Spark History Server can be used to track this information down, using the stdout/stderr links on the Executors page.

However, in the situation when the driver crashed _before_ writing out a history file, the SHS may not be aware of this application, and thus does not contain links to the driver logs. When this situation arises, it can be difficult for users to debug further, since they can't easily find their driver logs.

It is possible to reach the logs by using the {{yarn logs}} commands, but the average Spark user isn't aware of this and shouldn't have to be.

I propose adding, alongside the application report, some additional lines like:
{code}
         Driver Logs (stdout): http://hostname:8042/node/containerlogs/container_<id>/xkrogen/stdout?start=-4096
         Driver Logs (stderr): http://hostname:8042/node/containerlogs/container_<id>/ekrogen/stderr?start=-4096
{code}
With this information available, users can quickly jump to their driver logs, even if it crashed before the SHS became aware of the application. This has the additional benefit of providing a quick way to access driver logs, which often contain useful information, in a single click (instead of navigating through the Spark UI).


> YARN: Print direct links to driver logs alongside application report in cluster mode
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-33185
>                 URL: https://issues.apache.org/jira/browse/SPARK-33185
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>    Affects Versions: 3.0.1
>            Reporter: Erik Krogen
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
