Posted to solr-user@lucene.apache.org by Rafał Kuć <r....@solr.pl> on 2014/03/29 01:07:23 UTC

SolrCloud 4.6.1 hanging

Hello!

I have an issue with one of our SolrCloud deployments and wanted to
ask whether anyone has seen something similar. The setup is a
collection with 6 shards and a replication factor of 3, running on 6
physical servers, each with 24 cores. We've indexed about 32 million
documents and everything was fine up to that point.

Now, during performance tests, we've run into an issue - SolrCloud
hangs when querying and indexing run at the same time. At first we see
a normal load on the machines, then the load starts to drop and a
thread dump shows numerous threads like this:

Thread 12624: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=2043 (Compiled frame)
 - org.apache.http.pool.PoolEntryFuture.await(java.util.Date) @bci=50, line=131 (Compiled frame)
 - org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(java.lang.Object, java.lang.Object, long, java.util.concurrent.TimeUnit, org.apache.http.pool.PoolEntryFuture) @bci=431, line=281 (Compiled frame)
 - org.apache.http.pool.AbstractConnPool.access$000(org.apache.http.pool.AbstractConnPool, java.lang.Object, java.lang.Object, long, java.util.concurrent.TimeUnit, org.apache.http.pool.PoolEntryFuture) @bci=8, line=62 (Compiled frame)
 - org.apache.http.pool.AbstractConnPool$2.getPoolEntry(long, java.util.concurrent.TimeUnit) @bci=15, line=176 (Compiled frame)
 - org.apache.http.pool.AbstractConnPool$2.getPoolEntry(long, java.util.concurrent.TimeUnit) @bci=3, line=169 (Compiled frame)
 - org.apache.http.pool.PoolEntryFuture.get(long, java.util.concurrent.TimeUnit) @bci=38, line=100 (Compiled frame)
 - org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(java.util.concurrent.Future, long, java.util.concurrent.TimeUnit) @bci=4, line=212 (Compiled frame)
 - org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(long, java.util.concurrent.TimeUnit) @bci=10, line=199 (Compiled frame)
 - org.apache.http.impl.client.DefaultRequestDirector.execute(org.apache.http.HttpHost, org.apache.http.HttpRequest, org.apache.http.protocol.HttpContext) @bci=259, line=456 (Compiled frame)
 - org.apache.http.impl.client.AbstractHttpClient.execute(org.apache.http.HttpHost, org.apache.http.HttpRequest, org.apache.http.protocol.HttpContext) @bci=344, line=906 (Compiled frame)
 - org.apache.http.impl.client.AbstractHttpClient.execute(org.apache.http.client.methods.HttpUriRequest, org.apache.http.protocol.HttpContext) @bci=21, line=805 (Compiled frame)
 - org.apache.http.impl.client.AbstractHttpClient.execute(org.apache.http.client.methods.HttpUriRequest) @bci=6, line=784 (Compiled frame)
 - org.apache.solr.client.solrj.impl.HttpSolrServer.request(org.apache.solr.client.solrj.SolrRequest, org.apache.solr.client.solrj.ResponseParser) @bci=1175, line=395 (Interpreted frame)
 - org.apache.solr.client.solrj.impl.HttpSolrServer.request(org.apache.solr.client.solrj.SolrRequest) @bci=17, line=199 (Compiled frame)
 - org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(org.apache.solr.client.solrj.impl.LBHttpSolrServer$Req) @bci=132, line=285 (Interpreted frame)
 - org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest(org.apache.solr.client.solrj.request.QueryRequest, java.util.List) @bci=13, line=214 (Compiled frame)
 - org.apache.solr.handler.component.HttpShardHandler$1.call() @bci=246, line=161 (Compiled frame)
 - org.apache.solr.handler.component.HttpShardHandler$1.call() @bci=1, line=118 (Interpreted frame)
 - java.util.concurrent.FutureTask$Sync.innerRun() @bci=29, line=334 (Interpreted frame)
 - java.util.concurrent.FutureTask.run() @bci=4, line=166 (Compiled frame)
 - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=471 (Interpreted frame)
 - java.util.concurrent.FutureTask$Sync.innerRun() @bci=29, line=334 (Interpreted frame)
 - java.util.concurrent.FutureTask.run() @bci=4, line=166 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1145 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=724 (Interpreted frame)

I've checked I/O statistics, GC activity, memory usage, networking and
so on - none of those resources are exhausted during the test.

Hard autocommit is set to 15 seconds with openSearcher=false and
soft autocommit to 4 hours. We have a fairly high query rate, but until
we start indexing everything runs smoothly.
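
For reference, commit settings like the ones described would look
roughly as follows in solrconfig.xml. This is only a sketch: the actual
configuration file was not posted, the millisecond values are simply
15 seconds and 4 hours converted, and placing the elements directly
inside <updateHandler> is an assumption based on the stock Solr 4.x
layout.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit every 15 s; flushes to disk without opening a new searcher -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit every 4 h (4 * 60 * 60 * 1000 ms); makes changes visible to searches -->
  <autoSoftCommit>
    <maxTime>14400000</maxTime>
  </autoSoftCommit>
</updateHandler>
```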

Has anyone encountered similar behavior?

-- 
Regards,
 Rafał Kuć
 Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch


Re: SolrCloud 4.6.1 hanging

Posted by Rafał Kuć <r....@solr.pl>.
Hello!

I've created SOLR-5935. Please let me know if any more information is
needed. I'll be glad to help with the issue.

-- 
Regards,
 Rafał Kuć
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


> I'm looking into a hang as well - not sure if it involves searching
> as well, but it may. Can you file a JIRA issue? Let's track it down.

> - Mark



Re: SolrCloud 4.6.1 hanging

Posted by Mark Miller <ma...@gmail.com>.
I'm looking into a hang as well - not sure if it involves searching as well, but it may. Can you file a JIRA issue? Let's track it down.

- Mark
