Posted to user@spark.apache.org by Abdullah Bashir <ma...@gmail.com> on 2017/10/26 21:37:00 UTC

How many jobs are left to calculate estimated time

Hi

I am running the svd function on a 45 GB CSV with 8.9M rows, and I have
configured BLAS and ARPACK. The job has now been running for 14.8 hours,
and the Spark UI on port 4040 shows:

Cores:                160
Memory per Executor:  15.0 GB
State:                RUNNING
Duration:             14.4 h
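For context, the job in question is presumably something like the sketch
below (run from spark-shell, so `sc` is the SparkContext). The input path,
the column parsing, and the choice of k are placeholders, not taken from
the original post:

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// Hypothetical reconstruction: path and parsing are assumptions.
val rows = sc.textFile("hdfs:///data/input.csv")          // ~45 GB CSV, ~8.9M rows
  .map(line => Vectors.dense(line.split(',').map(_.toDouble)))
  .cache()                                                 // avoid re-reading the CSV on every iteration

val mat = new RowMatrix(rows)

// computeSVD drives ARPACK; each Lanczos iteration performs a distributed
// matrix-vector multiply via treeAggregate, which is why the Jobs page
// fills up with "treeAggregate at RowMatrix.scala" entries.
val svd = mat.computeSVD(k = 20, computeU = true)
```

Each ARPACK iteration submits one such treeAggregate job, so the job count
grows with the number of iterations rather than being fixed up front.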


In the Jobs UI I see (latest completed job):

Job Id:                                    1091
Description:                               treeAggregate at RowMatrix.scala:93
Submitted:                                 2017/10/26 21:21:32
Duration:                                  46 s
Stages: Succeeded/Total:                   2/2
Tasks (for all stages): Succeeded/Total:   383/383


The treeAggregate jobs started from job id 6, and the count is now past
1091 and still climbing. In the logs I am getting messages like these:

[Stage 2179:===================================================>(363 + 2) /
365]

[Stage 2179:===================================================>(364 + 1) /
365]

[Stage 2209:===================================================>(364 + 1) /
365]


So:

1. How can I identify how many jobs are left for this operation?
2. There is also one .toLocalIterator step that will run after this. Is my
understanding correct that the number of .toLocalIterator jobs will be
equal to the number of cores in my system?
3. Also, why is it so slow?
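Regarding question 2, the follow-up step would look roughly like the sketch
below; `result` is a placeholder name for whatever RDD is iterated after
the SVD finishes. As far as I understand, RDD.toLocalIterator fetches one
partition at a time to the driver, so it launches roughly one job per
partition of that RDD:

```scala
// `result` is a placeholder for the RDD consumed after the SVD step.
// toLocalIterator pulls partitions to the driver lazily, one at a time,
// so the number of follow-up jobs tracks the partition count.
val localIter = result.toLocalIterator
while (localIter.hasNext) {
  val row = localIter.next()
  // process one element at a time on the driver
}
```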



Best Regards,

*Abdullah Bashir*
*Senior Software Engineer,*
*Foretheta, LLC.*