Posted to commits@cassandra.apache.org by "Stu Hood (JIRA)" <ji...@apache.org> on 2011/02/12 08:51:57 UTC
[jira] Resolved: (CASSANDRA-2157) Hector concurrentHClient pool gives out more connections than its quota
[ https://issues.apache.org/jira/browse/CASSANDRA-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stu Hood resolved CASSANDRA-2157.
---------------------------------
Resolution: Invalid
This is an awesome bug report, but the Cassandra project itself does not maintain Hector. You should probably re-file this bug with the Hector developers on GitHub: https://github.com/rantav/hector
> Hector concurrentHClient pool gives out more connections than its quota
> -----------------------------------------------------------------------
>
> Key: CASSANDRA-2157
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2157
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.0
> Reporter: Yang Yang
>
> In Hector's ConcurrentHClient.java, a thread can give up waiting for a connection from the pool at line 85 (all references below are to the latest 0.7.0 head):
>     } else {
>       try {
>         cassandraClient = availableClientQueue.poll(maxWaitTimeWhenExhausted, TimeUnit.MILLISECONDS);
>         if ( cassandraClient == null ) {
>           numBlocked.decrementAndGet();
>           throw new PoolExhaustedException(String.format("maxWaitTimeWhenExhausted exceeded for thread %s on host %s",
>               new Object[]{
>                 Thread.currentThread().getName(),
>                 cassandraHost.getName()}
>               ));
>         }
>       } catch (InterruptedException ie) {
>         //monitor.incCounter(Counter.POOL_EXHAUSTED);
>         numActive.decrementAndGet();
>       }
> So if we specify a maxWaitTimeWhenExhausted, the borrow can give up and do a numActive.decrementAndGet().
> But in HConnectionManager.java, in the main loop of the method
>     public void operateWithFailover(Operation<?> op) throws HectorException {
> the call
>     client = getClientFromLBPolicy(excludeHosts);
> can throw an exception.
> The catch block has a clause for PoolExhaustedException:
>     } else if ( he instanceof PoolExhaustedException ) {
>       retryable = true;
>       --retries;
>       if ( hostPools.size() == 1 ) {
>         throw he;
>       }
>       monitor.incCounter(Counter.POOL_EXHAUSTED);
>       excludeHosts.add(client.cassandraHost);
>     }
> I guess this clause was written for the timeout scenario above, so it is supposed to catch that exception.
> But getClientFromLBPolicy() constructs a general HectorException from the PoolExhaustedException thrown by borrowClient(), so the instanceof check never matches.
> As a result, every pool-grab timeout propagates immediately to the caller, which I guess is not the original intention.
> So I think getClientFromLBPolicy() needs to rethrow the original exception directly, so as to trigger the logic in the catch block.
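
The type-loss problem described above can be illustrated with a minimal, self-contained sketch. It does not use Hector's real classes: HectorException, PoolExhaustedException, borrowClient() and the two getClient variants below are hypothetical stand-ins, written only to show why wrapping defeats an instanceof check while rethrowing preserves it.

```java
import java.util.function.Supplier;

class RethrowSketch {
    // Hypothetical stand-ins for the exception hierarchy named in the report.
    static class HectorException extends RuntimeException {
        HectorException(String msg) { super(msg); }
        HectorException(Throwable cause) { super(cause); }
    }

    static class PoolExhaustedException extends HectorException {
        PoolExhaustedException(String msg) { super(msg); }
    }

    // Simulates a pool whose wait time has been exceeded.
    static Object borrowClient() {
        throw new PoolExhaustedException("maxWaitTimeWhenExhausted exceeded");
    }

    // Wrapping variant: the caller only ever sees a plain HectorException.
    static Object getClientWrapping() {
        try {
            return borrowClient();
        } catch (Exception e) {
            throw new HectorException(e);
        }
    }

    // Rethrowing variant: HectorException subtypes pass through unchanged.
    static Object getClientRethrowing() {
        try {
            return borrowClient();
        } catch (HectorException he) {
            throw he;
        } catch (Exception e) {
            throw new HectorException(e);
        }
    }

    // Mirrors the instanceof test in the failover catch block.
    static boolean isTreatedAsPoolExhausted(Supplier<Object> getter) {
        try {
            getter.get();
            return false;
        } catch (HectorException he) {
            return he instanceof PoolExhaustedException;
        }
    }

    public static void main(String[] args) {
        System.out.println(isTreatedAsPoolExhausted(RethrowSketch::getClientWrapping));   // false
        System.out.println(isTreatedAsPoolExhausted(RethrowSketch::getClientRethrowing)); // true
    }
}
```

In the wrapping variant the original exception survives only as the cause, so a handler that tests the outer type never takes the pool-exhausted branch.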
> But after I made those changes, I often saw the pool's ActiveNum() go negative and TillExhausted rise above the quota, which does not make sense.
> This was because every code path goes through releaseClient() in the finally {} clause, so when the pool grab fails, numActive.decrementAndGet() has already been executed and is then executed again in the finally clause.
> This ends up creating many connections to the server, which bogs it down; we have seen it cause huge CPU load.
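
One way to avoid the double decrement described above is to count a client as active only after a successful borrow, and to make the release step a no-op when nothing was borrowed. The sketch below is an illustration under those assumptions, not Hector's implementation: BoundedPoolSketch, borrow(), release(), operate() and getNumActive() are hypothetical names, and plain Object instances stand in for real connections.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedPoolSketch {
    // Hypothetical stand-in for the exception named in the report.
    static class PoolExhaustedException extends RuntimeException {
        PoolExhaustedException(String msg) { super(msg); }
    }

    private final BlockingQueue<Object> available;
    private final AtomicInteger numActive = new AtomicInteger();
    private final long maxWaitMillis;

    BoundedPoolSketch(int maxActive, long maxWaitMillis) {
        this.available = new ArrayBlockingQueue<>(maxActive);
        for (int i = 0; i < maxActive; i++) {
            available.add(new Object()); // placeholder "connections"
        }
        this.maxWaitMillis = maxWaitMillis;
    }

    Object borrow() {
        Object client;
        try {
            client = available.poll(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new PoolExhaustedException("interrupted while waiting");
        }
        if (client == null) {
            // Timed out: numActive was never incremented, so nothing to undo.
            throw new PoolExhaustedException("wait time exceeded");
        }
        numActive.incrementAndGet(); // counted only after a successful borrow
        return client;
    }

    void release(Object client) {
        if (client == null) {
            return; // nothing was borrowed, so the counter must not change
        }
        numActive.decrementAndGet();
        available.offer(client);
    }

    int getNumActive() {
        return numActive.get();
    }

    // Mirrors the try/finally shape of a failover loop body.
    boolean operate() {
        Object client = null;
        try {
            client = borrow();
            return true;
        } catch (PoolExhaustedException e) {
            return false;
        } finally {
            release(client); // safe on both the success and the timeout path
        }
    }

    public static void main(String[] args) {
        BoundedPoolSketch pool = new BoundedPoolSketch(1, 10);
        Object held = pool.borrow();                  // exhaust the pool
        boolean ok = pool.operate();                  // times out after 10 ms
        System.out.println(ok + " " + pool.getNumActive()); // false 1
        pool.release(held);
        System.out.println(pool.getNumActive());      // 0
    }
}
```

Because the counter changes in exactly one place per direction, a timed-out borrow followed by the finally-block release leaves numActive unchanged, so it cannot go negative or let the pool exceed its quota.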
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira