Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2017/03/07 00:16:32 UTC
[jira] [Created] (SOLR-10234) "Too many open files" in distrib tests due to fixed HandleLimitFS (regardless of num nodes in test)
Hoss Man created SOLR-10234:
-------------------------------
Summary: "Too many open files" in distrib tests due to fixed HandleLimitFS (regardless of num nodes in test)
Key: SOLR-10234
URL: https://issues.apache.org/jira/browse/SOLR-10234
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Reporter: Hoss Man
I just got a failure from BasicDistributedZkTest on master (acb185b2dc7522e6a4fa55d54e82910736668f8d) that caught my attention -- the reported failure was "Remote error message: Exception writing document id 57 to the index; possible analysis error.", but digging into the logs, the root cause was "Too many open files" coming from the mock {{HandleLimitFS}} class we have...
{noformat}
[junit4] 2> 495598 ERROR (qtp155652658-4405) [ ] o.a.s.h.RequestHandlerBase java.nio.file.FileSystemException: /home/jenkins/lucene-solr/solr/build/solr-core/test/J1/temp/solr.cloud.BasicDistributedZkTest_8D04773C07230D3B-001/index-NIOFSDirectory-002/_o_Memory_0.mdvm: Too many open files
[junit4] 2> at org.apache.lucene.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:48)
[junit4] 2> at org.apache.lucene.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:81)
[junit4] 2> at org.apache.lucene.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:160)
[junit4] 2> at java.base/java.nio.file.Files.newOutputStream(Files.java:218)
[junit4] 2> at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:413)
[junit4] 2> at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:409)
[junit4] 2> at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:253)
[junit4] 2> at org.apache.lucene.store.MockDirectoryWrapper.createOutput(MockDirectoryWrapper.java:665)
...
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=BasicDistributedZkTest -Dtests.method=test -Dtests.seed=8D04773C07230D3B -Dtests.slow=true -Dtests.locale=en-ER -Dtests.timezone=Europe/Volgograd -Dtests.asserts=true -Dtests.file.encoding=UTF-8
[junit4] ERROR 259s J1 | BasicDistributedZkTest.test <<<
{noformat}
...what concerns me in particular about this is that it's coming from a distributed test, involving multiple "nodes" (all using the same randomized similarity) writing to the same "file://" filesystem in the same JVM -- but {{TestRuleTemporaryFilesCleanup}} seems to be initializing the filesystem with a fixed {{MAX_OPEN_FILES = 2048}}.
So perhaps all (distributed/cloud) Solr tests should use {{SuppressFileSystems}} to ensure we don't get false failures like this?
Or perhaps we should enhance the way we use {{HandleLimitFS}} in our test scaffolding so that we can give each Solr node its own mock filesystem? (with its own {{MAX_OPEN_FILES}} limit?)
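
To make the concern concrete, here is a toy model of the difference between one shared handle limit and a limit per node. It is only a sketch: {{HandleLimiter}}, {{runShared}}, {{runPerNode}}, and the figure of 600 open files per node are hypothetical names and numbers chosen for illustration, not the actual {{HandleLimitFS}} code or measurements from the failing run. Only the 2048 cap comes from the report above.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

public class HandleLimitSketch {

    /** Hypothetical stand-in for a handle-limiting filesystem: counts opens and fails past a cap. */
    static final class HandleLimiter {
        private final int maxOpenFiles;
        private final AtomicInteger open = new AtomicInteger();

        HandleLimiter(int maxOpenFiles) {
            this.maxOpenFiles = maxOpenFiles;
        }

        void onOpen(String path) throws IOException {
            if (open.incrementAndGet() > maxOpenFiles) {
                open.decrementAndGet(); // the failed open does not hold a handle
                throw new IOException(path + ": Too many open files");
            }
        }
    }

    /** All nodes draw from ONE limiter, as when every node shares the same mock filesystem. */
    static boolean runShared(int nodes, int filesPerNode, int maxOpenFiles) {
        HandleLimiter shared = new HandleLimiter(maxOpenFiles);
        try {
            for (int n = 0; n < nodes; n++) {
                for (int f = 0; f < filesPerNode; f++) {
                    shared.onOpen("node" + n + "/file" + f);
                }
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Each node gets its OWN limiter, i.e. its own MAX_OPEN_FILES budget. */
    static boolean runPerNode(int nodes, int filesPerNode, int maxOpenFiles) {
        try {
            for (int n = 0; n < nodes; n++) {
                HandleLimiter own = new HandleLimiter(maxOpenFiles);
                for (int f = 0; f < filesPerNode; f++) {
                    own.onOpen("node" + n + "/file" + f);
                }
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int limit = 2048;        // the fixed cap mentioned in the report
        int filesPerNode = 600;  // assumed figure, for illustration only
        System.out.println("shared, 1 node:    " + runShared(1, filesPerNode, limit));
        System.out.println("shared, 4 nodes:   " + runShared(4, filesPerNode, limit));
        System.out.println("per-node, 4 nodes: " + runPerNode(4, filesPerNode, limit));
    }
}
```

Under these assumed numbers, a single node stays well under the cap, four nodes sharing one limiter exceed it (4 x 600 = 2400 > 2048), and four nodes with their own limiters do not. The point is only that a fixed shared cap scales badly with the number of nodes in the test, regardless of whether any individual node is leaking handles.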
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)