Posted to user@spark.apache.org by Todd <bi...@163.com> on 2015/08/17 06:09:14 UTC

Understanding the two jobs run with spark sql join

Hi, I have a basic Spark SQL join running in local mode. I checked the UI and see that two jobs are run. Their DAG graphs are pasted at the end.
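For reference, the join is essentially of the following shape (a minimal sketch; the table and column names are illustrative, not my real ones, using the Spark 1.x SQLContext API):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    // Local-mode setup, as described above.
    val conf = new SparkConf().setAppName("join-demo").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Two small DataFrames; contents are made up for illustration.
    val left = sc.parallelize(Seq((1, "a"), (2, "b"))).toDF("id", "v1")
    val right = sc.parallelize(Seq((1, "x"), (3, "y"))).toDF("id", "v2")

    left.registerTempTable("left_t")
    right.registerTempTable("right_t")

    // The join whose execution produces the two jobs shown below.
    val joined = sqlContext.sql(
      "SELECT l.id, l.v1, r.v2 FROM left_t l JOIN right_t r ON l.id = r.id")
    joined.collect()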
I have several questions here:
1. It looks like Job0 and Job1 have the same DAG stages, but stage 3 and stage 4 are skipped. What does each of Job0 and Job1 do, why do they have the same DAG graph, and why are stage 3 and stage 4 skipped?
2. Job0 has only 5 tasks. What controls the number of tasks in Job0?
3. Job0 has 5 tasks and Job1 has 199 tasks. I thought the number of tasks in Job1 was controlled by spark.sql.shuffle.partitions, which is 200 by default (see the snippet after this list for the setting I mean). Why does it show 199 here?
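
For question 3, this is the setting I have in mind (a minimal sketch, assuming the relevant property is spark.sql.shuffle.partitions):

    // spark.sql.shuffle.partitions defaults to 200 and controls the number
    // of partitions (and hence tasks) used when Spark SQL shuffles data
    // for joins and aggregations.
    sqlContext.setConf("spark.sql.shuffle.partitions", "200")

    // Check the value currently in effect.
    println(sqlContext.getConf("spark.sql.shuffle.partitions"))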



Job0: [DAG graph screenshot; image not included in the plain text version]

Job1: [DAG graph screenshot; image not included in the plain text version]