Posted to issues@mesos.apache.org by "Benjamin Mahler (JIRA)" <ji...@apache.org> on 2014/12/09 02:23:12 UTC

[jira] [Updated] (MESOS-2182) Performance issue in libprocess SocketManager.

     [ https://issues.apache.org/jira/browse/MESOS-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-2182:
-----------------------------------
    Affects Version/s: 0.21.0

> Performance issue in libprocess SocketManager.
> ----------------------------------------------
>
>                 Key: MESOS-2182
>                 URL: https://issues.apache.org/jira/browse/MESOS-2182
>             Project: Mesos
>          Issue Type: Bug
>          Components: libprocess
>    Affects Versions: 0.21.0
>            Reporter: Benjamin Mahler
>            Priority: Blocker
>
> Noticed an issue in production in which the master is slow to respond for ~15 minutes after failover.
> After looking at some perf data, the top offenders are:
> {noformat}
>     12.02%  mesos-master  libmesos-0.21.0-rc3.so  [.] std::_Rb_tree<process::ProcessBase*, process::ProcessBase*, std::_Identity<process::ProcessBase*>, std::less<process::ProcessBase*>, std::allocator<process::ProcessBase*> >::erase(process::ProcessBase* const&)
> ...
>      3.29%  mesos-master  libmesos-0.21.0-rc3.so  [.] process::SocketManager::exited(process::ProcessBase*)
> {noformat}
> It appears that in the SocketManager, whenever an internal Process exits, we loop over all the links unnecessarily:
> {code}
> void SocketManager::exited(ProcessBase* process)
> {
>   // An exited event is enough to cause the process to get deleted
>   // (e.g., by the garbage collector), which means we can't
>   // dereference process (or even use the address) after we enqueue at
>   // least one exited event. Thus, we save the process pid.
>   const UPID pid = process->pid;
>   // Likewise, we need to save the current time of the process so we
>   // can update the clocks of linked processes as appropriate.
>   const Time time = Clock::now(process);
>   synchronized (this) {
>     // Iterate through the links, removing any links the process might
>     // have had and creating exited events for any linked processes.
>     foreachpair (const UPID& linkee, set<ProcessBase*>& processes, links) {
>       processes.erase(process);
>       if (linkee == pid) {
>         foreach (ProcessBase* linker, processes) {
>           CHECK(linker != process) << "Process linked with itself";
>           synchronized (timeouts) {
>             if (Clock::paused()) {
>               Clock::update(linker, time);
>             }
>           }
>           linker->enqueue(new ExitedEvent(linkee));
>         }
>       }
>     }
>     links.erase(pid);
>   }
> }
> {code}
> On clusters with 10,000s of slaves, this means we hold the socket manager lock for a very expensive loop that erases nothing from any set! This is because the master holds a link from the Master Process to each slave, so 'links' contains an entry per slave; yet when some unrelated ephemeral Process terminates, there is no need to loop over each slave link.
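> A more targeted version might look roughly like the following. This is a minimal sketch, not an actual patch: it assumes a hypothetical reverse index 'linkers' (mapping each ProcessBase* to the set of UPIDs it has linked to) that does not exist in the code above, so that we only touch the link sets the exiting process actually appears in:
> {code}
> void SocketManager::exited(ProcessBase* process)
> {
>   // As above, save the pid and time before enqueuing any exited
>   // events, since the process may be deleted afterwards.
>   const UPID pid = process->pid;
>   const Time time = Clock::now(process);
>   synchronized (this) {
>     // Remove the process only from the link sets it appears in,
>     // via the hypothetical 'linkers' reverse index, rather than
>     // iterating over every entry in 'links'.
>     foreach (const UPID& linkee, linkers[process]) {
>       links[linkee].erase(process);
>     }
>     linkers.erase(process);
>     // Create exited events only for processes linked to this pid.
>     map<UPID, set<ProcessBase*> >::iterator iter = links.find(pid);
>     if (iter != links.end()) {
>       foreach (ProcessBase* linker, iter->second) {
>         CHECK(linker != process) << "Process linked with itself";
>         synchronized (timeouts) {
>           if (Clock::paused()) {
>             Clock::update(linker, time);
>           }
>         }
>         linker->enqueue(new ExitedEvent(pid));
>       }
>       links.erase(iter);
>     }
>   }
> }
> {code}
> With the reverse index, the work done under the lock is proportional to the number of links the exiting process is actually involved in, not to the total number of slaves.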
> While we hold this lock, the following calls will block:
> {code}
> class SocketManager
> {
> public:
>   Socket accepted(int s);
>   void link(ProcessBase* process, const UPID& to);
>   PID<HttpProxy> proxy(const Socket& socket);
>   void send(Encoder* encoder, bool persist);
>   void send(const Response& response,
>             const Request& request,
>             const Socket& socket);
>   void send(Message* message);
>   Encoder* next(int s);
>   void close(int s);
>   void exited(const Node& node);
>   void exited(ProcessBase* process);
> ...
> {code}
> As a result, the slave observers and the master can block calling send()!
> Short term, we will try to fix this issue by removing the unnecessary looping. Longer term, it would be nice to avoid all this locking when sending on independent sockets.
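> For the longer term, one possible shape (purely illustrative; none of these names exist in libprocess, and the actual write path is elided) is to shard the locking so that each socket's outgoing state is guarded by its own mutex, and the global lock only protects the socket table:
> {code}
> #include <mutex>
> #include <queue>
> #include <string>
> #include <unordered_map>
>
> // Illustrative sketch only. Each socket's pending output gets its
> // own mutex, so send() on one socket does not serialize against
> // send() on an unrelated socket.
> struct SocketState
> {
>   std::mutex mutex;               // Guards only this socket's queue.
>   std::queue<std::string> queue;  // Pending outgoing data.
> };
>
> class ShardedSocketManager
> {
> public:
>   void send(int s, std::string data)
>   {
>     SocketState* state = NULL;
>     {
>       // Hold the global lock only long enough to find (or create)
>       // the per-socket state. References into std::unordered_map
>       // remain valid across rehashes, so the pointer is safe to
>       // use after the lock is released.
>       std::lock_guard<std::mutex> guard(tableMutex);
>       state = &sockets[s];
>     }
>     // All per-socket work happens outside the global lock.
>     std::lock_guard<std::mutex> guard(state->mutex);
>     state->queue.push(std::move(data));
>     // ... kick off the actual write here if none is in flight ...
>   }
>
> private:
>   std::mutex tableMutex;  // Guards 'sockets' only.
>   std::unordered_map<int, SocketState> sockets;
> };
> {code}
> With this shape, a busy or slow peer only delays senders on its own socket, and exited() no longer contends with send() on unrelated sockets.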



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)