Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2016/07/22 07:52:20 UTC

[jira] [Commented] (SPARK-16676) Spark jobs stay in pending

    [ https://issues.apache.org/jira/browse/SPARK-16676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15389124#comment-15389124 ] 

Sean Owen commented on SPARK-16676:
-----------------------------------

Did your executors schedule? Your other operations sound like transformations, which don't do anything when executed as commands, so the fact that they returned immediately doesn't mean anything. This doesn't look like enough info to suggest a Spark problem.
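
The distinction here is Spark's lazy evaluation: transformations only record lineage, and nothing is scheduled until an action runs. A minimal sketch for a spark-shell session (sc is predefined by the shell; the file path is a hypothetical stand-in for the tutorial's data):

    // Transformations return immediately because nothing is computed yet.
    val lines   = sc.textFile("data/flights.csv")      // lazy: just records lineage
    val delayed = lines.filter(_.contains("DELAYED"))  // lazy: still no job submitted

    // Only an action builds a DAG, submits a job, and asks the scheduler for
    // executor resources. If no executors are available, this is the call that
    // appears to hang, even though the transformations above returned instantly.
    val n = delayed.count()
    println("delayed flights: " + n)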

> Spark jobs stay in pending
> --------------------------
>
>                 Key: SPARK-16676
>                 URL: https://issues.apache.org/jira/browse/SPARK-16676
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib, Spark Shell
>    Affects Versions: 1.5.2
>         Environment: Mac OS X Yosemite, Terminal, Spark-shell standalone
>            Reporter: Joe Chong
>         Attachments: Spark UI stays @ pending.png
>
>
> I've been having issues executing certain Scala statements within the spark-shell. The statements come from a tutorial/blog written by Carol McDonald at MapR. 
> The import statements and reading text files into DataFrames work fine. However, when I try to run df.show(), execution hits a roadblock. Checking the Spark UI, I see that the stage is active, but one of its dependent jobs stays in Pending without any movement. The logs are below. 
> scala> fltCountsql.show()
> 16/07/22 11:40:16 INFO spark.SparkContext: Starting job: show at <console>:46
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Registering RDD 31 (show at <console>:46)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Got job 4 (show at <console>:46) with 200 output partitions
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Final stage: ResultStage 8(show at <console>:46)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 7)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 7)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 7 (MapPartitionsRDD[31] at show at <console>:46), which has no missing parents
> 16/07/22 11:40:16 INFO storage.MemoryStore: ensureFreeSpace(18128) called with curMem=115755879, maxMem=2778495713
> 16/07/22 11:40:16 INFO storage.MemoryStore: Block broadcast_5 stored as values in memory (estimated size 17.7 KB, free 2.5 GB)
> 16/07/22 11:40:16 INFO storage.MemoryStore: ensureFreeSpace(7527) called with curMem=115774007, maxMem=2778495713
> 16/07/22 11:40:16 INFO storage.MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 7.4 KB, free 2.5 GB)
> 16/07/22 11:40:16 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on localhost:61408 (size: 7.4 KB, free: 2.5 GB)
> 16/07/22 11:40:16 INFO spark.SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:861
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 7 (MapPartitionsRDD[31] at show at <console>:46)
> 16/07/22 11:40:16 INFO scheduler.TaskSchedulerImpl: Adding task set 7.0 with 2 tasks
> 16/07/22 11:40:16 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 7.0 (TID 4, localhost, PROCESS_LOCAL, 2156 bytes)
> 16/07/22 11:40:16 INFO executor.Executor: Running task 0.0 in stage 7.0 (TID 4)
> 16/07/22 11:40:16 INFO storage.BlockManager: Found block rdd_2_0 locally
> 16/07/22 11:40:17 INFO executor.Executor: Finished task 0.0 in stage 7.0 (TID 4). 2738 bytes result sent to driver
> 16/07/22 11:40:17 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 7.0 (TID 4) in 920 ms on localhost (1/2)
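
For reference, the "200 output partitions" in the log come from Spark SQL's default shuffle-partition count (spark.sql.shuffle.partitions, default 200). Lowering it is a common sanity check when a local spark-shell job seems stuck on scheduling (a debugging aid only, not a confirmed cause here; sqlContext is the SQLContext predefined by spark-shell 1.5.x):

    // Schedule fewer shuffle tasks on a small local machine; 8 is an
    // arbitrary illustrative value.
    sqlContext.setConf("spark.sql.shuffle.partitions", "8")

    // Re-running the action now requests 8 output partitions instead of 200.
    fltCountsql.show()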



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org