Posted to issues@mesos.apache.org by "Benjamin Mahler (JIRA)" <ji...@apache.org> on 2015/05/15 03:34:59 UTC
[jira] [Updated] (MESOS-2507) Performance issue in the master when a large number of slaves are registering.
[ https://issues.apache.org/jira/browse/MESOS-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benjamin Mahler updated MESOS-2507:
-----------------------------------
Sprint: Twitter Q2 Sprint 3 - 5/11
Assignee: Benjamin Mahler
Story Points: 5
> Performance issue in the master when a large number of slaves are registering.
> ------------------------------------------------------------------------------
>
> Key: MESOS-2507
> URL: https://issues.apache.org/jira/browse/MESOS-2507
> Project: Mesos
> Issue Type: Improvement
> Components: master
> Reporter: Benjamin Mahler
> Assignee: Benjamin Mahler
> Labels: scalability, twitter
>
> For large clusters, when a lot of slaves are registering, the master gets backlogged processing registration requests. {{perf}} revealed the following:
> {code}
> Events: 14K cycles
> 25.44% libmesos-0.22.0-x.so [.] mesos::internal::master::Master::registerSlave(process::UPID const&, mesos::SlaveInfo const&, std::vector<mesos::Resource, std::allocator<mesos::Resource> > cons
> 11.18% libmesos-0.22.0-x.so [.] pipecb
> 5.88% libc-2.5.so [.] malloc_consolidate
> 5.33% libc-2.5.so [.] _int_free
> 5.25% libc-2.5.so [.] malloc
> 5.23% libc-2.5.so [.] _int_malloc
> 4.11% libstdc++.so.6.0.8 [.] std::string::assign(std::string const&)
> 3.22% libmesos-0.22.0-x.so [.] mesos::Resource::SharedDtor()
> 3.10% [kernel] [k] _raw_spin_lock
> 1.97% libmesos-0.22.0-x.so [.] mesos::Attribute::SharedDtor()
> 1.28% libc-2.5.so [.] memcmp
> 1.08% libc-2.5.so [.] free
> {code}
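> This is likely because we loop over all of the registered slaves for each registration request, so the cost of each request grows linearly with the number of registered slaves: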
> {code}
> void Master::registerSlave(
>     const UPID& from,
>     const SlaveInfo& slaveInfo,
>     const vector<Resource>& checkpointedResources,
>     const string& version)
> {
>   // ...
>   // Check if this slave is already registered (because it retries).
>   foreachvalue (Slave* slave, slaves.registered) {
>     if (slave->pid == from) {
>       // ...
>     }
>   }
>   // ...
> }
> {code}
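One possible way to avoid the per-registration scan is to keep the registered slaves indexed by pid, so the "already registered" check becomes a hash lookup instead of a loop. The sketch below is only an illustration under stated assumptions: {{Slave}}, {{SlaveRegistry}}, {{findByPid}}, {{add}} and {{remove}} are hypothetical names, a {{std::string}} stands in for {{process::UPID}}, and none of this is taken from the actual Mesos code or from the eventual fix for this ticket.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the master's per-slave state (not the Mesos type).
struct Slave {
  std::string id;
  std::string pid;  // Assumption: a string stands in for process::UPID.
};

// Hypothetical registry that indexes registered slaves by pid.
class SlaveRegistry {
public:
  // Returns the slave registered from `pid`, or nullptr if there is none.
  // Average O(1), rather than a scan over every registered slave.
  Slave* findByPid(const std::string& pid) {
    auto it = byPid_.find(pid);
    return it == byPid_.end() ? nullptr : &it->second;
  }

  // Adds a slave. Returns false if its pid is already registered,
  // which is the "slave retried its registration" case.
  bool add(const Slave& slave) {
    return byPid_.emplace(slave.pid, slave).second;
  }

  // Removes the slave registered from `pid`. Returns true if one was removed.
  bool remove(const std::string& pid) {
    return byPid_.erase(pid) > 0;
  }

  std::size_t size() const { return byPid_.size(); }

private:
  std::unordered_map<std::string, Slave> byPid_;
};
```

With an index like this, the check at the top of a registration handler would reduce to a single {{findByPid(from)}} call. The trade-off is that the index has to be updated wherever slaves are added or removed.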
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)