Posted to user@spark.apache.org by Jeff Evans <je...@gmail.com> on 2020/01/17 22:09:46 UTC
Is there a way to get the final web URL from an active Spark context
Given a session/context, we can get the UI web URL like this:
sparkSession.sparkContext.uiWebUrl
This gives me something like http://node-name.cluster-name:4040. When I
open this from outside the cluster (e.g., from my laptop), it redirects
via HTTP 302 to something like
http://node-name.cluster-name:8088/proxy/redirect/application_1579210019853_0023/.
For discussion purposes, call the latter the "final web URL".
Critically, this final URL remains active even after the application
terminates. The original uiWebUrl
(http://node-name.cluster-name:4040) is not available after the
application terminates, so one has to capture the redirect in
time in order to provide a persistent link to that history server
UI entry (e.g., for debugging purposes).
Is there a way, other than using some HTTP client, to detect what this
final URL will be directly from the SparkContext?
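For reference, the HTTP-client workaround mentioned above could be sketched roughly as follows. This is only an illustration in Python: the helper name resolve_final_url is hypothetical, and it assumes it runs on a machine where the 302 redirect described above actually occurs. It returns wherever the redirect chain ends, which may be further along than the first redirect target.

```python
import urllib.request


def resolve_final_url(ui_web_url, timeout=5.0):
    """Follow any HTTP redirects from ui_web_url and return the URL finally reached.

    Hypothetical helper: urllib follows 302 responses by default, and
    geturl() reports the URL of the last response in the chain.
    """
    with urllib.request.urlopen(ui_web_url, timeout=timeout) as response:
        return response.geturl()


# Possible usage while the application is still running (assumes a PySpark
# session named `spark`):
#   final_url = resolve_final_url(spark.sparkContext.uiWebUrl)
```

The call has to happen before the application terminates, since the port 4040 endpoint disappears with it.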
Re: Is there a way to get the final web URL from an active Spark context
Posted by Jeff Evans <je...@gmail.com>.
To answer my own question, it turns out what I was after is the YARN
ResourceManager URL for the Spark application. As alluded to in SPARK-20458
<https://issues.apache.org/jira/browse/SPARK-20458>, it's possible to use
the YARN API client to get this value. Here is a gist that shows how it
can be done (given an instance of the Hadoop Configuration object):
https://gist.github.com/jeff303/8dab0e52dc227741b6605f576a317798
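As a rough illustration of the same idea (not the contents of the gist), here is a sketch that calls the YARN client from PySpark through the py4j gateway. Several things here are assumptions: the function name yarn_tracking_url is hypothetical, `_jvm` and `_jsc` are private PySpark attributes, the application is running on YARN with the Hadoop YARN client classes on the driver classpath, and ApplicationId.fromString requires Hadoop 2.8 or later.

```python
def yarn_tracking_url(sc):
    """Ask the YARN ResourceManager for this application's tracking URL.

    `sc` is assumed to be a PySpark SparkContext running on YARN. `_jvm` and
    `_jsc` are private PySpark attributes, used here only for illustration.
    """
    jvm = sc._jvm
    hadoop_conf = sc._jsc.hadoopConfiguration()  # the Hadoop Configuration object

    # Assumption: ApplicationId.fromString is available (Hadoop 2.8 or later).
    app_id = jvm.org.apache.hadoop.yarn.api.records.ApplicationId.fromString(
        sc.applicationId
    )

    client = jvm.org.apache.hadoop.yarn.client.api.YarnClient.createYarnClient()
    client.init(hadoop_conf)
    client.start()
    try:
        report = client.getApplicationReport(app_id)
        return report.getTrackingUrl()
    finally:
        # Always release the client, even if the lookup fails.
        client.stop()


# Possible usage (assumes a PySpark session named `spark`):
#   print(yarn_tracking_url(spark.sparkContext))
```

For a running application, the tracking URL reported by YARN is expected to be the ResourceManager proxy address rather than the port 4040 address, but that is worth verifying on the cluster in question.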