Posted to common-issues@hadoop.apache.org by "Daryn Sharp (JIRA)" <ji...@apache.org> on 2015/07/23 00:11:06 UTC

[jira] [Commented] (HADOOP-11772) RPC Invoker relies on static ClientCache which has synchronized(this) blocks

    [ https://issues.apache.org/jira/browse/HADOOP-11772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637738#comment-14637738 ] 

Daryn Sharp commented on HADOOP-11772:
--------------------------------------

A very delayed -1.  CacheBuilder is obscenely expensive for a concurrent map, and it generates unnecessary garbage even just to look up a key.  Replace it with ConcurrentHashMap.
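
For illustration only, here is a minimal sketch of what a ConcurrentHashMap-backed cache could look like. It is not the attached patch and not Hadoop's ClientCache: CachedClient, ChmClientCache, and the Object key are hypothetical stand-ins, and the assumption that a new client starts with a reference count of 1 is mine. The point it shows is that a cache hit is a plain get() with no lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a cached IPC client; not Hadoop's Client class. */
final class CachedClient {
    // Assumption: a freshly created client already counts its creator as one user.
    private final AtomicInteger refCount = new AtomicInteger(1);

    void incCount() { refCount.incrementAndGet(); }

    int getCount() { return refCount.get(); }
}

/** Hypothetical cache keyed by an arbitrary factory object. */
final class ChmClientCache {
    private final ConcurrentMap<Object, CachedClient> clients =
        new ConcurrentHashMap<Object, CachedClient>();

    CachedClient getClient(Object factory) {
        // Fast path: get() on a ConcurrentHashMap takes no lock.
        CachedClient client = clients.get(factory);
        if (client == null) {
            // Slow path: build outside any lock, then publish atomically.
            CachedClient created = new CachedClient();
            client = clients.putIfAbsent(factory, created);
            if (client == null) {
                return created; // this thread won the race; count is already 1
            }
            // Another thread published first; drop ours and reuse theirs.
        }
        client.incCount();
        return client;
    }
}
```

The trade-off in this sketch is that two threads racing on the same missing key may both construct a client and one instance is thrown away. That is acceptable only if construction has no side effects.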

I identified this issue, which impaired my own perf testing under load.  The slowdown isn't just the sync.  It's the expense of Connection's ctor stalling other connections.  The expense of ConnectionId#equals causes delays.  Synch'ing on connections causes unfair contention, unlike a sync'ed method.  Concurrency simply hides this.
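
To make the constructor-stall point concrete, here is one well-known way to keep an expensive constructor from holding up callers that want a different key: cache a cheap FutureTask per key and run the construction outside the map. This is a sketch under my own assumptions, not code from any patch on this jira. PerKeyLazyCache is a hypothetical name, and K and V merely stand in for a connection id and a connection.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

/** Hypothetical per-key lazy cache; not part of Hadoop. */
final class PerKeyLazyCache<K, V> {
    private final ConcurrentMap<K, FutureTask<V>> entries =
        new ConcurrentHashMap<K, FutureTask<V>>();

    V get(K key, Callable<V> expensiveCtor)
            throws InterruptedException, ExecutionException {
        FutureTask<V> task = entries.get(key);
        if (task == null) {
            // Only this cheap, not-yet-started task is published to the map.
            FutureTask<V> created = new FutureTask<V>(expensiveCtor);
            task = entries.putIfAbsent(key, created);
            if (task == null) {
                task = created;
                task.run(); // the expensive work runs here, outside the map
            }
        }
        // Only callers asking for this same key wait for the construction.
        return task.get();
    }
}
```

Unlike a synchronized getter, a slow construction for one key delays only the threads that need that key. The sketch omits failure handling: a task whose constructor threw stays cached and rethrows on every later get().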

If I can ever escape from the, um, "fun" of stabilizing our production clusters, I'll dig out a patch that is more efficient than a CHM.  In the meantime, use a CHM.  Either re-fix on this jira, or file another critical jira.

> RPC Invoker relies on static ClientCache which has synchronized(this) blocks
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-11772
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11772
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: ipc, performance
>            Reporter: Gopal V
>            Assignee: Haohui Mai
>             Fix For: 2.8.0
>
>         Attachments: HADOOP-11772-001.patch, HADOOP-11772-002.patch, HADOOP-11772-003.patch, HADOOP-11772-wip-001.patch, HADOOP-11772-wip-002.patch, HADOOP-11772.004.patch, after-ipc-fix.png, cached-connections.png, cached-locking.png, dfs-sync-ipc.png, sync-client-bt.png, sync-client-threads.png
>
>
> {code}
>   private static ClientCache CLIENTS=new ClientCache();
> ...
>     this.client = CLIENTS.getClient(conf, factory);
> {code}
> Meanwhile in ClientCache
> {code}
> public synchronized Client getClient(Configuration conf,
>       SocketFactory factory, Class<? extends Writable> valueClass) {
> ...
>     Client client = clients.get(factory);
>     if (client == null) {
>       client = new Client(valueClass, conf, factory);
>       clients.put(factory, client);
>     } else {
>       client.incCount();
>     }
> {code}
> All invokers end up calling these methods, resulting in IPC clients choking up.
> !sync-client-threads.png!
> !sync-client-bt.png!
> !dfs-sync-ipc.png!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)