Posted to user@spark.apache.org by DB Tsai <db...@alpinenow.com> on 2014/01/24 02:11:41 UTC

Submitting job to Yarn's ResourceManager

Hi guys,

We were able to submit our Spark application to YARN's ResourceManager
(PivotalHD 1.0.1), and the expected result is printed in the stdout of the
container log; however, when we took a look at the Tracking UI, we got the
following error 500. There is no problem with a traditional MapReduce job.
Is there any suggestion for digging into this problem?

Secondly, we submitted our job through our Java app, and since the
application actually runs on a remote machine, there seems to be no easy
way to interact between our main application running locally and the Spark
application running remotely. For example, if we want to know the progress
of our Spark application, we have to write it to HDFS and then read it back
in our main application. The same applies to retrieving the final Spark job
result. Is there any way to interact through some kind of API without going
through HDFS?

Finally, we sometimes want to test some code quickly through spark-shell,
and it seems that every operation runs locally when we launch spark-shell
in yarn-client mode (we know it's not supported to run spark-shell in
yarn-standalone mode). Is there any way to make spark-shell run in a
distributed fashion on YARN?

Thanks.
HTTP ERROR 500

Problem accessing /proxy/application_1389853114516_1011/. Reason:

    Server Error

Caused by:

java.io.IOException: java.net.URISyntaxException: Expected authority at index 7: http://
	at org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.doGet(WebAppProxyServlet.java:318)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:652)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1320)



Sincerely,

DB Tsai
Machine Learning Engineer
Alpine Data Labs
--------------------------------------
Web: http://alpinenow.com/

Re: Submitting job to Yarn's ResourceManager

Posted by Tom Graves <tg...@yahoo.com>.
The tracking URL will error if the application has completed or if you are running in yarn-client mode (that hasn't been hooked up). In yarn-client mode you can get the URL for the UI from the messages printed on the console.

In yarn-client mode, are you setting the environment variables to ask for workers, etc.? Also make sure the Hadoop conf directory is in your classpath.

Tom
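
[Editor's note: in the Spark releases of this era (0.8.x/0.9.x), yarn-client mode was configured through the environment variables Tom mentions. A sketch based on the "Running on YARN" documentation of that period; the jar path, conf directory, and worker counts are example values to substitute with your own:]

```shell
# Sketch for Spark 0.8.x/0.9.x yarn-client mode; paths and values are
# placeholders for your own build and cluster.
export HADOOP_CONF_DIR=/etc/hadoop/conf   # Hadoop conf must be on the classpath
export SPARK_JAR=./assembly/target/scala-2.10/spark-assembly_2.10-0.8.1-incubating-hadoop2.0.5-alpha.jar
export SPARK_WORKER_INSTANCES=4           # number of workers to request from YARN
export SPARK_WORKER_MEMORY=2g             # memory per worker
export SPARK_WORKER_CORES=2               # cores per worker

# With these set, spark-shell operations run on the YARN workers rather
# than locally.
MASTER=yarn-client ./bin/spark-shell
```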



On Thursday, January 23, 2014 7:12 PM, DB Tsai <db...@alpinenow.com> wrote:
 