You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@geode.apache.org by "Mario Ivanac (Jira)" <ji...@apache.org> on 2020/02/19 12:51:00 UTC

[jira] [Assigned] (GEODE-6536) client pr single hop will do multiple hops if things get busy

     [ https://issues.apache.org/jira/browse/GEODE-6536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mario Ivanac reassigned GEODE-6536:
-----------------------------------

    Assignee: Mario Ivanac

> client pr single hop will do multiple hops if things get busy
> -------------------------------------------------------------
>
>                 Key: GEODE-6536
>                 URL: https://issues.apache.org/jira/browse/GEODE-6536
>             Project: Geode
>          Issue Type: Improvement
>          Components: client/server
>            Reporter: Darrel Schneider
>            Assignee: Mario Ivanac
>            Priority: Major
>              Labels: performance
>
> I was wondering why ConnectionManagerImpl.borrowConnection that targets a particular server will not wait for a connection to be available. This is the borrow used by pr single hop. It seems like if all the cnxs to that server are busy and we are at max that it should wait. But instead it does a single scan of the pool and if it does not find an idle connection it either throws AllConnectionsInUse or it creates one. The caller sets the onlyUseExistingCnx parameter based on if the pool is at max cnxs or not (seems like this should have been done in borrowConnection instead of the caller). If it throws AllConnectionsInUse the caller catches it and ignores it and then falls into doing a non-targeted borrow that will wait for a connection but to ANY server. So our single hop optimization goes away on a busy client. Since it can’t immediately get a connection to server A, it says just give me a connection to any server sends the op to that other server and that server ends up having to forward it to server A. You would only see this problem if you set a max on your client connection pool. The default of -1 just says to always create another connection. The only workaround I know of is to not limit how many connections your client can create but that could cause other problem (i.e. too many file descriptors).
> The questionable code is in the following classes:
> org.apache.geode.cache.client.internal.GetOp
> org.apache.geode.cache.client.internal.PutOp
> org.apache.geode.cache.client.internal.DestroyOp
> The all have something like this:
>  
> {code:java}
> boolean onlyUseExistingCnx = ((poolImpl.getMaxConnections() != -1
>  && poolImpl.getConnectionCount() >= poolImpl.getMaxConnections()) ? true : false);
>  op.setAllowDuplicateMetadataRefresh(!onlyUseExistingCnx);
>  return pool.executeOn(new ServerLocation(server.getHostName(), server.getPort()), op,
>  true, onlyUseExistingCnx);
> {code}
> executeOn eventually calls ConnectionManagerImpl.borrowConnection 
> It is unclear to me why they need allow duplicate metadata refresh if they permit borrowConnection to create a new connection to the specified server. They all catch and ignore 
> AllConnectionsInUseException falling through and doing non-targetted pool.execute.
> The concern might have been that the pool would be full of connections to other servers preventing us from connecting to the server we need to communicate with. If we have existing connections to the server we could wait for one of those. If we don't and we are at max then we should consider closing a connection to some other server and create a connection to the server we know we need for this operation. As the server count goes up and the max connection count on the pool goes down I'm sure we are probably better off not even trying to do single hop since we might not even be able to have a connection to each server in the pool. But if we have a reasonable max connection count then it seems like we could make more of an effort to make sure a per server connection count is balanced and then wait for one of those connection to be available.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)