Posted to issues@mesos.apache.org by "Benjamin Mahler (JIRA)" <ji...@apache.org> on 2015/06/26 02:37:04 UTC
[jira] [Updated] (MESOS-2940) Reconciliation is expensive for large numbers of tasks.

     [ https://issues.apache.org/jira/browse/MESOS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-2940:
-----------------------------------
    Attachment: perf-kernel.svg

Attached a flame graph. It turns out that UUID::random(), called from [here|https://github.com/apache/mesos/blob/f62530487198ce170e664e47902ddd5ab2b90d6f/src/common/protobuf_utils.cpp#L62], is expensive:

{noformat}
[ RUN      ] TaskCount/Reconciliation_BENCHMARK_Test.Explicit/3
Starting reconciliation of 100000 tasks
Reconciled 100000 tasks in 3.01183433973333mins
[       OK ] TaskCount/Reconciliation_BENCHMARK_Test.Explicit/3 (183554 ms)
{noformat}

Removing [this line|https://github.com/apache/mesos/blob/f62530487198ce170e664e47902ddd5ab2b90d6f/src/common/protobuf_utils.cpp#L62] leads to the following:

{noformat}
[ RUN      ] TaskCount/Reconciliation_BENCHMARK_Test.Explicit/3
Starting reconciliation of 100000 tasks
Reconciled 100000 tasks in 2.67714075secs
[       OK ] TaskCount/Reconciliation_BENCHMARK_Test.Explicit/3 (5201 ms)
{noformat}
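
To put the two runs side by side, here is a small sketch of the arithmetic. The constants are copied from the benchmark output above; the function names are only illustrative. It works out to a roughly 67x reduction in reconciliation time, or about 1.8 ms saved per task on average.

```cpp
#include <cassert>

// Numbers copied from the two benchmark runs above.
constexpr double kTasks = 100000.0;
constexpr double kWithUuidSecs = 3.01183433973333 * 60.0;  // "3.01183433973333mins"
constexpr double kWithoutUuidSecs = 2.67714075;            // "2.67714075secs"

// Ratio of reconciliation time before vs. after removing the line.
double speedup(double beforeSecs, double afterSecs) {
  assert(afterSecs > 0.0);
  return beforeSecs / afterSecs;
}

// Average extra cost per task attributable to the removed line, in ms.
double perTaskOverheadMs(double beforeSecs, double afterSecs, double tasks) {
  assert(tasks > 0.0);
  return (beforeSecs - afterSecs) / tasks * 1000.0;
}
```

Calling speedup(kWithUuidSecs, kWithoutUuidSecs) and perTaskOverheadMs(kWithUuidSecs, kWithoutUuidSecs, kTasks) gives the two figures quoted above.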

Fortunately, master-generated status updates do not need UUIDs, so this should be fixable.
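
One possible shape for such a fix, as a rough sketch only: make the UUID optional and generate it only when the caller asks for one. Everything here is hypothetical and is not the Mesos code or its eventual patch: the StatusUpdate struct, randomUuidBytes() and the needsUuid parameter are stand-ins, and std::optional (C++17) is used in place of whatever option type the real code would use.

```cpp
#include <array>
#include <cstdint>
#include <optional>
#include <random>
#include <string>

// Hypothetical stand-in for a status update message; not the Mesos protobuf.
struct StatusUpdate {
  std::string taskId;
  std::string state;
  // Left unset for updates that do not need a UUID.
  std::optional<std::array<std::uint8_t, 16>> uuid;
};

// Hypothetical stand-in for UUID::random(): 16 random bytes with the
// version-4 and variant bits set.
std::array<std::uint8_t, 16> randomUuidBytes() {
  static std::random_device device;
  std::array<std::uint8_t, 16> bytes{};
  for (auto& b : bytes) {
    b = static_cast<std::uint8_t>(device() & 0xFF);
  }
  bytes[6] = static_cast<std::uint8_t>((bytes[6] & 0x0F) | 0x40);
  bytes[8] = static_cast<std::uint8_t>((bytes[8] & 0x3F) | 0x80);
  return bytes;
}

// Only pays for UUID generation when the caller asks for it.
StatusUpdate createStatusUpdate(
    const std::string& taskId,
    const std::string& state,
    bool needsUuid) {
  StatusUpdate update{taskId, state, std::nullopt};
  if (needsUuid) {
    update.uuid = randomUuidBytes();
  }
  return update;
}
```

Under this assumption, the reconciliation path would call createStatusUpdate(taskId, state, false) and skip the random-number cost entirely, while other callers would keep passing true.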

The benchmark I wrote for MESOS-2941 was a bit of a hack: I ended up just duplicating the code structure rather than trying to launch a lot of tasks on an in-process cluster.

> Reconciliation is expensive for large numbers of tasks.
> -------------------------------------------------------
>
>                 Key: MESOS-2940
>                 URL: https://issues.apache.org/jira/browse/MESOS-2940
>             Project: Mesos
>          Issue Type: Improvement
>          Components: master
>            Reporter: Benjamin Mahler
>            Priority: Critical
>              Labels: twitter
>         Attachments: perf-kernel.svg
>
>
> We've observed that both implicit and explicit reconciliation are expensive for large numbers of tasks:
> {noformat: title=Explicit with O(100,000) tasks: 70secs}
> I0625 20:55:23.716320 21937 master.cpp:3863] Performing explicit task state reconciliation for N tasks of framework F (NAME) at S@IP:PORT
> I0625 20:56:34.812464 21937 master.cpp:5041] Removing task T with resources R of framework F on slave S at slave(1)@IP:PORT (HOST)
> {noformat}
> {noformat: title=Implicit with O(100,000) tasks: 60secs}
> I0625 20:25:22.310601 21936 master.cpp:3802] Performing implicit task state reconciliation for framework F (NAME) at S@IP:PORT
> I0625 20:26:23.874528 21921 master.cpp:218] Scheduling shutdown of slave S due to health check timeout
> {noformat}
> Let's add a benchmark to see if there are any bottlenecks here, and to guide improvements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)