Posted to issues@spark.apache.org by "Xu Zhongxing (JIRA)" <ji...@apache.org> on 2014/08/13 11:43:12 UTC

[jira] [Issue Comment Deleted] (SPARK-2204) Scheduler for Mesos in fine-grained mode launches tasks on wrong executors

     [ https://issues.apache.org/jira/browse/SPARK-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xu Zhongxing updated SPARK-2204:
--------------------------------

    Comment: was deleted

(was: I encountered this issue again when using Spark 1.0.2, Mesos 0.18.1, and the master branch of spark-cassandra-connector.

The fix may not cover some failure/exception paths.

I am running Spark in coarse-grained mode. The executors throw some exceptions, but the Spark driver keeps waiting and repeatedly prints:

TRACE [spark-akka.actor.default-dispatcher-17] 2014-08-11 10:57:32,998 Logging.scala (line 66) Checking for hosts with no recent heart beats in BlockManagerMaster.

The Mesos master WARNING log:
W0811 10:32:58.172175 1646 master.cpp:2103] Ignoring unknown exited executor 20140808-113811-858302656-5050-1645-2 on slave 20140808-113811-858302656-5050-1645-2 (ndb9)
W0811 10:32:58.181217 1649 master.cpp:2103] Ignoring unknown exited executor 20140808-113811-858302656-5050-1645-5 on slave 20140808-113811-858302656-5050-1645-5 (ndb5)
W0811 10:32:58.277014 1650 master.cpp:2103] Ignoring unknown exited executor 20140808-113811-858302656-5050-1645-3 on slave 20140808-113811-858302656-5050-1645-3 (ndb6)
W0811 10:32:58.344130 1648 master.cpp:2103] Ignoring unknown exited executor 20140808-113811-858302656-5050-1645-0 on slave 20140808-113811-858302656-5050-1645-0 (ndb0)
W0811 10:32:58.354117 1651 master.cpp:2103] Ignoring unknown exited executor 20140804-095254-505981120-5050-20258-11 on slave 20140804-095254-505981120-5050-20258-11 (ndb2)
W0811 10:32:58.550233 1647 master.cpp:2103] Ignoring unknown exited executor 20140804-172212-505981120-5050-26571-2 on slave 20140804-172212-505981120-5050-26571-2 (ndb3)
W0811 10:32:58.793258 1653 master.cpp:2103] Ignoring unknown exited executor 20140804-095254-505981120-5050-20258-19 on slave 20140804-095254-505981120-5050-20258-19 (ndb1)
W0811 10:32:58.904842 1652 master.cpp:2103] Ignoring unknown exited executor 20140804-172212-505981120-5050-26571-0 on slave 20140804-172212-505981120-5050-26571-0 (ndb4)

Some other logs are at: 
https://github.com/datastax/spark-cassandra-connector/issues/134
)

> Scheduler for Mesos in fine-grained mode launches tasks on wrong executors
> --------------------------------------------------------------------------
>
>                 Key: SPARK-2204
>                 URL: https://issues.apache.org/jira/browse/SPARK-2204
>             Project: Spark
>          Issue Type: Bug
>          Components: Mesos
>    Affects Versions: 1.0.0
>            Reporter: Sebastien Rainville
>            Assignee: Sebastien Rainville
>            Priority: Blocker
>             Fix For: 1.0.1, 1.1.0
>
>
> MesosSchedulerBackend.resourceOffers(SchedulerDriver, List[Offer]) assumes that TaskSchedulerImpl.resourceOffers(Seq[WorkerOffer]) returns task lists in the same order as the offers it was passed. In the current implementation, however, TaskSchedulerImpl.resourceOffers shuffles the offers to avoid always assigning tasks to the same executors. As a result, tasks are launched on the wrong executors.
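
The ordering problem described in the issue can be illustrated with a small toy model. This is only a sketch in Python, not Spark's Scala code: WorkerOffer, Task, resource_offers and the two launch_* helpers are hypothetical stand-ins, and it assumes each task records the executor it was scheduled for. It shows why pairing task lists with offers by position breaks once the scheduler shuffles the offers, and how pairing by executor ID avoids depending on the order.

```python
import random
from collections import namedtuple

# Hypothetical stand-ins for the offer and task types; not Spark's real classes.
WorkerOffer = namedtuple("WorkerOffer", ["executor_id", "host", "cores"])
Task = namedtuple("Task", ["task_id", "executor_id"])


def resource_offers(offers, rng):
    """Toy scheduler: shuffle the offers, then return one task list per offer
    in the shuffled order, as the issue describes."""
    shuffled = list(offers)
    rng.shuffle(shuffled)
    return [[Task(task_id=i, executor_id=o.executor_id)]
            for i, o in enumerate(shuffled)]


def launch_by_position(offers, task_lists):
    """Fragile pairing: assumes task_lists[i] belongs to offers[i]."""
    return [(o.executor_id, t)
            for o, tasks in zip(offers, task_lists) for t in tasks]


def launch_by_executor_id(offers, task_lists):
    """Order-independent pairing: use the executor ID stored on each task."""
    known = {o.executor_id for o in offers}
    return [(t.executor_id, t)
            for tasks in task_lists for t in tasks if t.executor_id in known]


def misplaced(launches):
    """Tasks launched on an executor other than the one they were scheduled for."""
    return [t for executor_id, t in launches if executor_id != t.executor_id]


if __name__ == "__main__":
    offers = [WorkerOffer("exec-%d" % i, "ndb%d" % i, 1) for i in range(4)]
    task_lists = resource_offers(offers, random.Random())
    print("misplaced by position:   ", len(misplaced(launch_by_position(offers, task_lists))))
    print("misplaced by executor ID:", len(misplaced(launch_by_executor_id(offers, task_lists))))
```

The count for positional pairing varies from run to run because it depends on the shuffle, while pairing by executor ID never misplaces a task in this model.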


