Posted to user@spark.apache.org by Ankur Srivastava <an...@gmail.com> on 2017/03/07 02:40:14 UTC

Spark Jobs under Cluster Manager

Hello Users,

We have recently upgraded from Spark 1.6 to Spark 2.0. Up to 1.6 we only
saw actions listed on the Jobs page of the YARN cluster manager UI.
Clicking on an action would list its stages, split at the points in the
DAG where the data is shuffled, but in 2.0 the behavior seems to be
significantly different.

After every action we see a job named "run at ThreadPoolExecutor.java:1142"
with multiple stages. It does not point to any action, and the stages show
the same description. I have no clue what is happening under the hood.

Attached is a Job page from the current application run. How can I know
what is running at each of these stages and optimize it?
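(One way to make jobs easier to attribute on the Jobs page is to label them
before triggering an action. This is a minimal sketch, not something from the
original thread: setJobGroup/setJobDescription are standard SparkContext APIs
in Spark 2.x, but the group name, description, and DataFrame below are
hypothetical placeholders. Note that job-group labels are thread-local, so
jobs launched from Spark's internal thread pools, such as the
"run at ThreadPoolExecutor.java" entries, may not inherit them.)

```scala
import org.apache.spark.sql.SparkSession

// Sketch: label Spark jobs so they are identifiable in the UI.
// Assumes a Spark 2.x environment; names below are illustrative only.
val spark = SparkSession.builder()
  .appName("labeled-jobs-example")
  .getOrCreate()
val sc = spark.sparkContext

// All jobs triggered on this thread until clearJobGroup() carry this label,
// making it easier to trace a UI entry back to the action that caused it.
sc.setJobGroup("daily-aggregation", "count after join", interruptOnCancel = false)

val df = spark.range(0, 1000).toDF("id")  // hypothetical DataFrame
df.count()  // appears under the "daily-aggregation" group in the Jobs page

sc.clearJobGroup()
```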

Thanks for help!!

Regards
Ankur