Posted to dev@spark.apache.org by Gerard Maas <ge...@gmail.com> on 2014/12/08 16:38:59 UTC

Understanding reported times on the Spark UI [+ Streaming]

Hi,

I'm confused about the stage times reported on the Spark UI (Spark 1.1.0)
for a Spark Streaming job. I'm hoping somebody can shed some light on it:

Let's do this with an example:

On the /stages page, stage # 232 is reported to have lasted 18 seconds:
  Stage Id:                232
  Description:             runJob at RDDFunctions.scala:23
  Submitted:               2014/12/08 15:06:25
  Duration:                18 s
  Tasks: Succeeded/Total:  12/12

  (detail page: http://localhost:24040/stages/stage?id=232&attempt=0)

When I click on it for details, I see: [1]

Total time across all tasks = 42s

Aggregated metrics by executor:
Executor1 19s
Executor2 24s

Summing the individual task times actually gives 40.009 s.

What exactly is the time reported on the overview page (the 18 s)?

What is the relation between the time reported on the overview page and the
times on the detail page?
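
To make it concrete, this is roughly the cross-check I have in mind: a
SparkListener that logs a stage's wall-clock time next to the summed task
durations. Just a sketch, not tested on 1.1.0; I'm assuming the
onTaskEnd/onStageCompleted callbacks and the StageInfo
submissionTime/completionTime fields behave as described in the comments.

  import scala.collection.mutable
  import org.apache.spark.scheduler.{SparkListener, SparkListenerStageCompleted, SparkListenerTaskEnd}

  // Per stage, log the wall-clock duration (completion - submission) and the
  // sum of the individual task durations, which I assume is what the UI
  // calls "Total time across all tasks".
  class StageTimingListener extends SparkListener {
    private val taskTimeByStage = mutable.Map.empty[Int, Long].withDefaultValue(0L)

    override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
      // taskInfo.duration = finishTime - launchTime, in milliseconds
      taskTimeByStage(taskEnd.stageId) += taskEnd.taskInfo.duration
    }

    override def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit = {
      val info = stageCompleted.stageInfo
      val wallClockMs = for {
        submitted <- info.submissionTime
        completed <- info.completionTime
      } yield completed - submitted
      println(s"Stage ${info.stageId} (${info.name}): " +
        s"wall-clock = ${wallClockMs.getOrElse(-1L)} ms, " +
        s"sum of task durations = ${taskTimeByStage(info.stageId)} ms")
    }
  }

  // registered with: sc.addSparkListener(new StageTimingListener)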

My Spark Streaming job is reported to take 3m24s, and (I think) there's only
one stage in my job. How does the per-stage timing relate to what the Spark
Streaming 'streaming' page reports (e.g. 'last batch')?

Is there a way to relate a streaming batch to the stages executed to
complete that batch?
The numbers as they are at the moment don't seem to add up.
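
For the batch-to-stages question, the kind of side-by-side comparison I'm
after looks something like this on the streaming side (again just a sketch;
I'm assuming the StreamingListener/BatchInfo API in
org.apache.spark.streaming.scheduler):

  import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

  // Log, per completed batch, the times that the 'streaming' page reports
  // ('last batch' etc.), so they can be lined up against the stage durations
  // shown on the /stages page.
  class BatchTimingListener extends StreamingListener {
    override def onBatchCompleted(batchCompleted: StreamingListenerBatchCompleted): Unit = {
      val info = batchCompleted.batchInfo
      println(s"Batch ${info.batchTime}: " +
        s"scheduling delay = ${info.schedulingDelay.getOrElse(-1L)} ms, " +
        s"processing time = ${info.processingDelay.getOrElse(-1L)} ms, " +
        s"total delay = ${info.totalDelay.getOrElse(-1L)} ms")
    }
  }

  // registered with: ssc.addStreamingListener(new BatchTimingListener)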

Thanks,

Gerard.


[1] https://drive.google.com/file/d/0BznIWnuWhoLlMkZubzY2dTdOWDQ