Posted to issues@spark.apache.org by "Joe Chong (JIRA)" <ji...@apache.org> on 2016/07/22 03:53:20 UTC
[jira] [Updated] (SPARK-16676) Spark jobs stay in pending

     [ https://issues.apache.org/jira/browse/SPARK-16676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe Chong updated SPARK-16676:
------------------------------
    Attachment: Spark UI stays @ pending.png

> Spark jobs stay in pending
> --------------------------
>
>                 Key: SPARK-16676
>                 URL: https://issues.apache.org/jira/browse/SPARK-16676
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib, Spark Shell
>    Affects Versions: 1.5.2
>         Environment: Mac OS X Yosemite, Terminal, Spark-shell standalone
>            Reporter: Joe Chong
>         Attachments: Spark UI stays @ pending.png
>
>
> I've been having issues executing certain Scala statements within the spark-shell. These statements come from a tutorial/blog post written by Carol McDonald at MapR.
> The import statements and reading text files into DataFrames work fine. However, when I call df.show(), the execution hits a roadblock. Checking the job in the Spark UI, I see that the stage is active, but one of its dependent jobs stays in Pending without any movement. The logs are below.
> scala> fltCountsql.show()
> 16/07/22 11:40:16 INFO spark.SparkContext: Starting job: show at <console>:46
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Registering RDD 31 (show at <console>:46)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Got job 4 (show at <console>:46) with 200 output partitions
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Final stage: ResultStage 8(show at <console>:46)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 7)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 7)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 7 (MapPartitionsRDD[31] at show at <console>:46), which has no missing parents
> 16/07/22 11:40:16 INFO storage.MemoryStore: ensureFreeSpace(18128) called with curMem=115755879, maxMem=2778495713
> 16/07/22 11:40:16 INFO storage.MemoryStore: Block broadcast_5 stored as values in memory (estimated size 17.7 KB, free 2.5 GB)
> 16/07/22 11:40:16 INFO storage.MemoryStore: ensureFreeSpace(7527) called with curMem=115774007, maxMem=2778495713
> 16/07/22 11:40:16 INFO storage.MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 7.4 KB, free 2.5 GB)
> 16/07/22 11:40:16 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on localhost:61408 (size: 7.4 KB, free: 2.5 GB)
> 16/07/22 11:40:16 INFO spark.SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:861
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 7 (MapPartitionsRDD[31] at show at <console>:46)
> 16/07/22 11:40:16 INFO scheduler.TaskSchedulerImpl: Adding task set 7.0 with 2 tasks
> 16/07/22 11:40:16 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 7.0 (TID 4, localhost, PROCESS_LOCAL, 2156 bytes)
> 16/07/22 11:40:16 INFO executor.Executor: Running task 0.0 in stage 7.0 (TID 4)
> 16/07/22 11:40:16 INFO storage.BlockManager: Found block rdd_2_0 locally
> 16/07/22 11:40:17 INFO executor.Executor: Finished task 0.0 in stage 7.0 (TID 4). 2738 bytes result sent to driver
> 16/07/22 11:40:17 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 7.0 (TID 4) in 920 ms on localhost (1/2)


