Posted to dev@lucene.apache.org by "Daisy.Yuan (JIRA)" <ji...@apache.org> on 2016/12/10 09:04:58 UTC
[jira] [Comment Edited] (SOLR-9830) Once the IndexWriter is closed due
to a RuntimeException such as FileSystemException, it never returns to
normal unless the Solr JVM is restarted
[ https://issues.apache.org/jira/browse/SOLR-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15737533#comment-15737533 ]
Daisy.Yuan edited comment on SOLR-9830 at 12/10/16 9:04 AM:
------------------------------------------------------------
I ran some reliability tests to make Solr more robust.
I used fault injection to consume system file handles until the system's handle limit was reached.
When I stopped the fault injection, the number of handles in use dropped back below the limit.
I expected Solr to return to normal automatically, but it did not, and the data became inconsistent between some shards' replicas.
I tried to reload the collection, but it failed with: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:740)
protected final void ensureOpen(boolean failIfClosing) throws AlreadyClosedException {
  if (closed || (failIfClosing && closing)) {
    throw new AlreadyClosedException("this IndexWriter is closed", tragedy);
  }
}
Here closed = true and failIfClosing = true, and tragedy is the old SolrCore's exception.
Solr only returns to normal after a restart of the Solr JVM.
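The behaviour described above matches Lucene's "tragic event" handling: once updateDocument hits an unrecoverable exception, the writer closes itself and remembers the cause, and every later call fails in ensureOpen until a brand-new IndexWriter is opened. A minimal self-contained sketch of that pattern (SketchIndexWriter and the null-document trigger are hypothetical, not Lucene's real code):

```java
import java.io.IOException;

// Sketch of Lucene's tragic-event pattern: after one unrecoverable
// exception, the writer stays closed and keeps rethrowing the cause.
public class TragicWriterSketch {
    static class AlreadyClosedException extends IllegalStateException {
        AlreadyClosedException(String msg, Throwable cause) { super(msg, cause); }
    }

    static class SketchIndexWriter {
        private volatile boolean closed = false;
        private volatile Throwable tragedy = null;

        // Stand-in for IndexWriter.updateDocument: a null doc simulates the
        // FileSystemException ("Too many open files in system").
        void updateDocument(String doc) throws IOException {
            ensureOpen();
            if (doc == null) {
                tragedy = new IOException("Too many open files in system");
                closed = true;   // "hit tragic ... inside updateDocument": writer closes itself
                throw (IOException) tragedy;
            }
        }

        // Mirrors IndexWriter.ensureOpen: rethrows with the saved tragedy as cause.
        void ensureOpen() {
            if (closed) {
                throw new AlreadyClosedException("this IndexWriter is closed", tragedy);
            }
        }
    }

    public static void main(String[] args) {
        SketchIndexWriter w = new SketchIndexWriter();
        try {
            w.updateDocument(null);     // the injected fault hits once
        } catch (IOException expected) { }
        try {
            w.updateDocument("doc2");   // every later update fails...
        } catch (AlreadyClosedException e) {
            // ...and the cause is still the original fault, even after it cleared
            System.out.println(e.getMessage() + " / cause: " + e.getCause().getMessage());
        }
    }
}
```

This is why reloading the collection did not help: the reload path still reaches the same closed writer, and only constructing a fresh IndexWriter clears the state.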
> Once the IndexWriter is closed due to a RuntimeException such as FileSystemException, it never returns to normal unless the Solr JVM is restarted
> ---------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-9830
> URL: https://issues.apache.org/jira/browse/SOLR-9830
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: update
> Affects Versions: 6.2
> Environment: Red Hat 4.4.7-3,SolrCloud
> Reporter: Daisy.Yuan
>
> 1. Collection coll_test has 9 shards, each with two replicas on different Solr instances.
> 2. While updating documents to the collection with SolrJ, inject the exhausted-file-handle fault into one Solr instance (e.g. solr1).
> 3. Updates to col_test_shard3_replica1 (the leader) fail with FileSystemException, and the IndexWriter is closed.
> 4. After clearing the fault, col_test_shard3_replica1 (the leader) still cannot accept document updates, and its numDocs stays lower than the standby replica's.
> 5. After a restart of the Solr instance, it can update documents again and numDocs is consistent between the two replicas.
> I think that in SolrCloud mode Solr should recover by itself in this case; a restart should not be needed to restore the SolrCore's update function.
> 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | [DWPT][http-nio-21101-exec-20]: now abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | [DWPT][http-nio-21101-exec-20]: done abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: hit exception updating document | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: hit tragic FileSystemException inside updateDocument | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: rollback | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: all running merges have aborted | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,934 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: rollback: done finish merges | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,934 | INFO | http-nio-21101-exec-20 | [DW][http-nio-21101-exec-20]: abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,939 | INFO | commitScheduler-46-thread-1 | [DWPT][commitScheduler-46-thread-1]: flush postings as segment _4h9 numDocs=3798 | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [DWPT][commitScheduler-46-thread-1]: now abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [DWPT][commitScheduler-46-thread-1]: done abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | http-nio-21101-exec-20 | [DW][http-nio-21101-exec-20]: done abort success=true | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [DW][commitScheduler-46-thread-1]: commitScheduler-46-thread-1 finishFullFlush success=false | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: rollback: infos=_4g7(6.2.0):C59169/23684:delGen=4 _4gq(6.2.0):C67474/11636:delGen=1 _4gg(6.2.0):C64067/15664:delGen=2 _4gr(6.2.0):C13131 _4gs(6.2.0):C966 _4gt(6.2.0):C4543 _4gu(6.2.0):C6960 _4gv(6.2.0):C2544 | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [IW][commitScheduler-46-thread-1]: hit exception during NRT reader | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,967 | INFO | http-nio-21101-exec-20 | [col_test_shard3_replica1] webapp=/solr path=/update params={wt=javabin&version=2}{add=[5____5 (1552493084330164224), 24____5 (1552493084330164225), 28____5 (1552493084331212800), 32____5 (1552493084331212801), 44____5 (1552493084331212802), 46____5 (1552493084331212803), 64____5 (1552493084331212804), 94____5 (1552493084331212805), 100____5 (1552493084331212806), 119____5 (1552493084331212807), ... (74 adds)]} 0 43 | org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.finish(LogUpdateProcessorFactory.java:187)
> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2143)
> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:695)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:471)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:450)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:400)
> at org.apache.solr.servlet.SolrAuthorizationFilter.doFilter(SolrAuthorizationFilter.java:195)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at com.huawei.solr.security.check.SolrParaCheckFilter.doFilter(SolrParaCheckFilter.java:201)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at com.huawei.solr.security.audit.AuditFilter.doFilter(AuditFilter.java:145)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at com.huawei.solr.security.auth.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:611)
> at com.huawei.solr.security.auth.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:578)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at com.huawei.solr.security.auth.cas.HttpServletRequestWrapperFilterWrapper.doFilter(HttpServletRequestWrapperFilterWrapper.java:37)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at com.huawei.solr.security.auth.cas.Cas20ProxyReceivingTicketValidationFilterWrapper.doFilter(Cas20ProxyReceivingTicketValidationFilterWrapper.java:71)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at com.huawei.solr.security.auth.cas.Cas20AuthenticationFilterWrapper.doFilter(Cas20AuthenticationFilterWrapper.java:60)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at com.huawei.solr.security.auth.cas.LogoutFilter.doFilter(LogoutFilter.java:84)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at com.huawei.solr.monitor.MemMonitorFilter.doFilter(MemMonitorFilter.java:81)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at com.huawei.solr.security.auth.ServerRealmFilter.doFilter(ServerRealmFilter.java:55)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at com.huawei.solr.security.auth.RerouteRequestFilter.doFilter(RerouteRequestFilter.java:58)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:218)
> at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
> at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)
> at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
> at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
> at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
> at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:442)
> at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1083)
> at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:640)
> at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1756)
> at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1715)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
> at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:740)
> at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:754)
> at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1558)
> at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:279)
> at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:211)
> at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:166)
> ... 73 more
> Caused by: java.nio.file.FileSystemException: /srv/BigData/solr/solrserveradmin/col_test_shard3_replica1/data/index/_4ha.fdx: Too many open files in system
> at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
> at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
> at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
> at java.nio.file.Files.newOutputStream(Files.java:216)
> at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:413)
> at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:409)
> at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:253)
> at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:44)
> at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
> at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:108)
> at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:128)
> at org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.fieldsWriter(Lucene50StoredFieldsFormat.java:183)
> at org.apache.lucene.index.DefaultIndexingChain.initStoredFieldsWriter(DefaultIndexingChain.java:83)
> at org.apache.lucene.index.DefaultIndexingChain.startStoredFields(DefaultIndexingChain.java:331)
> at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:368)
> at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:231)
> at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:478)
> at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1562)
> ... 76 more
>
>
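The self-recovery the report asks for could look like a retry wrapper around the writer: on AlreadyClosedException, discard the tragically closed writer, open a fresh one over the same core, and retry the update once. A self-contained sketch under those assumptions (every class below is hypothetical; Solr's actual hook for rebuilding a writer is roughly SolrCoreState.newIndexWriter, which this code does not use):

```java
import java.util.function.Supplier;

// Hypothetical sketch: an update handler that rebuilds its writer after a
// tragic close instead of staying dead until a JVM restart.
public class RecoverySketch {
    static class AlreadyClosedException extends IllegalStateException {
        AlreadyClosedException(String msg) { super(msg); }
    }

    interface Writer {
        void updateDocument(String doc);
    }

    static class RecoveringUpdateHandler {
        private final Supplier<Writer> writerFactory; // stand-in for reopening the real IndexWriter
        private Writer writer;

        RecoveringUpdateHandler(Supplier<Writer> writerFactory) {
            this.writerFactory = writerFactory;
            this.writer = writerFactory.get();
        }

        // Retry once with a fresh writer if the current one is tragically closed.
        synchronized void addDoc(String doc) {
            try {
                writer.updateDocument(doc);
            } catch (AlreadyClosedException dead) {
                writer = writerFactory.get(); // rebuild instead of staying dead
                writer.updateDocument(doc);   // succeeds if the fault has cleared
            }
        }
    }

    public static void main(String[] args) {
        int[] created = {0};
        RecoveringUpdateHandler h = new RecoveringUpdateHandler(() -> {
            created[0]++;
            boolean dead = created[0] == 1; // first writer simulates the closed one
            return doc -> {
                if (dead) throw new AlreadyClosedException("this IndexWriter is closed");
            };
        });
        h.addDoc("doc1"); // transparently recovers instead of failing forever
        System.out.println("writers created: " + created[0]); // prints 2
    }
}
```

A real fix would be more careful than a blind retry, e.g. bounding the retries and triggering replica recovery if the new writer also dies, but the shape of the recovery is the same.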
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)