You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2014/01/03 00:57:50 UTC
[jira] [Commented] (SOLR-5601) NPE in
OverseerCollectionProcessor.migrateKey - inconsistent failure from
MigrateRouteKeyTest
[ https://issues.apache.org/jira/browse/SOLR-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860972#comment-13860972 ]
Hoss Man commented on SOLR-5601:
--------------------------------
The initial start of trouble seened to me an NPE in OverseerCollectionProcessor.migrateKey...
{noformat}
[junit4] 2> 324502 T1186 oasc.ZkController.checkRecovery I am the leader, no recovery necessary
[junit4] 2> 324502 T1186 oasc.ZkController.publish publishing core=split_shard2_temp_shard1_shard1_replica1 state=active
[junit4] 2> 324514 T1228 C220 P42114 oasup.LogUpdateProcessor.finish [collection1] webapp=/tr_/xq path=/update params={distrib.from=http://127.0.0.1:47570/tr_/xq/collection1/&version=2&wt=javabin&update.distrib=FROMLEADER} {add=[m/1!116 (1456162963872284672)]} 0 0
[junit4] 2> 324515 T1189 C221 P47570 oasup.LogUpdateProcessor.finish [collection1] webapp=/tr_/xq path=/update params={version=2&wt=javabin} {add=[m/1!116 (1456162963872284672)]} 0 3
[junit4] 2> 324526 T1186 oascc.ZkStateReader.updateClusterState Updating cloud state from ZooKeeper...
[junit4] 2> 324527 T1186 oasc.SolrXMLCoresLocator.doPersist Persisted core descriptions to /home/hossman/lucene/dev/solr/build/solr-core/test/J2/../../../../../../../../../home/hossman/lucene/dev/solr/build/solr-core/test/J2/./org.apache.solr.cloud.MigrateRouteKeyTest-jetty1-1388705195623/solr.xml
[junit4] 2> 324527 T1186 oass.SolrDispatchFilter.handleAdminRequest [admin] webapp=null path=/admin/cores params={numShards=1&collection.configName=conf1&version=2&wt=javabin&name=split_shard2_temp_shard1_shard1_replica1&shard=shard1&collection=split_shard2_temp_shard1&qt=/admin/cores&action=CREATE} status=0 QTime=1656
[junit4] 2> 324528 T1174 oasc.OverseerCollectionProcessor.createCollection Finished create command on all shards for collection: split_shard2_temp_shard1
[junit4] 2> 324528 T1174 oasc.OverseerCollectionProcessor.migrateKey Asking source leader to wait for: split_shard2_temp_shard1_shard1_replica1 to be alive on: 127.0.0.1:47570_tr_%2Fxq
[junit4] 2> 324528 T1174 oasc.SolrException.log ERROR Collection migrate of migrate failed:java.lang.NullPointerException
[junit4] 2> at org.apache.solr.cloud.OverseerCollectionProcessor.migrateKey(OverseerCollectionProcessor.java:1208)
[junit4] 2> at org.apache.solr.cloud.OverseerCollectionProcessor.migrate(OverseerCollectionProcessor.java:1102)
[junit4] 2> at org.apache.solr.cloud.OverseerCollectionProcessor.processMessage(OverseerCollectionProcessor.java:252)
[junit4] 2> at org.apache.solr.cloud.OverseerCollectionProcessor.run(OverseerCollectionProcessor.java:176)
[junit4] 2> at java.lang.Thread.run(Thread.java:724)
[junit4] 2>
[junit4] 2> 324551 T1217 oasc.DistributedQueue$LatchChildWatcher.process LatchChildWatcher fired on path: /overseer/collection-queue-work/qnr-0000000002 state: SyncConnected type NodeDataChanged
[junit4] 2> 324552 T1172 oasc.DistributedQueue$LatchChildWatcher.process LatchChildWatcher fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
[junit4] 2> 324552 T1172 oasc.DistributedQueue$LatchChildWatcher.process LatchChildWatcher fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
[junit4] 2> 324559 T1172 oasc.DistributedQueue$LatchChildWatcher.process LatchChildWatcher fired on path: /overseer/collection-queue-work state: SyncConnected type NodeChildrenChanged
[junit4] 2> 324559 T1174 oasc.OverseerCollectionProcessor.run Overseer Collection Processor: Message id:/overseer/collection-queue-work/qn-0000000002 complete, response:{success={null={responseHeader={status=0,QTime=0},core=migrate_multipleshardtest_targetCollection_shard1_replica1,status=BUFFERING},null={responseHeader={status=0,QTime=1656},core=split_shard2_temp_shard1_shard1_replica1}},Operation migrate caused exception:=java.lang.NullPointerException,exception={msg=null,rspCode=-1}}
[junit4] 2> 324568 T1173 oascc.ZkStateReader.updateClusterState Updating cloud state from ZooKeeper...
[junit4] 2> 324569 T1173 oasc.Overseer$ClusterStateUpdater.updateState Update state numShards=1 message={
[junit4] 2> "operation":"state",
[junit4] 2> "state":"active",
[junit4] 2> "base_url":"http://127.0.0.1:47570/tr_/xq",
[junit4] 2> "core":"split_shard2_temp_shard1_shard1_replica1",
[junit4] 2> "roles":null,
[junit4] 2> "node_name":"127.0.0.1:47570_tr_%2Fxq",
[junit4] 2> "shard":"shard1",
[junit4] 2> "shard_range":null,
[junit4] 2> "shard_state":"active",
[junit4] 2> "shard_parent":null,
[junit4] 2> "collection":"split_shard2_temp_shard1",
[junit4] 2> "numShards":"1",
[junit4] 2> "core_node_name":"core_node1"}
[junit4] 2> 324570 T1209 oass.SolrDispatchFilter.handleAdminRequest [admin] webapp=null path=/admin/collections params={split.key=a/1!&version=2&wt=javabin&forward.timeout=45&collection=collection1&target.collection=migrate_multipleshardtest_targetCollection&action=MIGRATE} status=500 QTime=2211
[junit4] 2> 324571 T1209 oasc.SolrException.log ERROR null:org.apache.solr.common.SolrException
[junit4] 2> at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:204)
[junit4] 2> at org.apache.solr.handler.admin.CollectionsHandler.handleMigrate(CollectionsHandler.java:450)
[junit4] 2> at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:174)
[junit4] 2> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
[junit4] 2> at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:705)
[junit4] 2> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
[junit4] 2> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:200)
[junit4] 2> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
[junit4] 2> at org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:136)
[junit4] 2> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
[junit4] 2> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
[junit4] 2> at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
[junit4] 2> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
[junit4] 2> at org.eclipse.jetty.server.handler.GzipHandler.handle(GzipHandler.java:301)
[junit4] 2> at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1077)
[junit4] 2> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
[junit4] 2> at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
[junit4] 2> at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
[junit4] 2> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
[junit4] 2> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
[junit4] 2> at org.eclipse.jetty.server.Server.handle(Server.java:368)
[junit4] 2> at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
[junit4] 2> at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
[junit4] 2> at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
[junit4] 2> at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
[junit4] 2> at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
[junit4] 2> at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
[junit4] 2> at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628)
[junit4] 2> at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
[junit4] 2> at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
[junit4] 2> at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
[junit4] 2> at java.lang.Thread.run(Thread.java:724)
[junit4] 2>
[junit4] 2> 324572 T1249 C222 P46266 oasup.LogUpdateProcessor.finish [collection1] webapp=/tr_/xq path=/update params={distrib.from=http://127.0.0.1:52363/tr_/xq/collection1/&version=2&wt=javabin&update.distrib=FROMLEADER} {add=[n/1!117 (1456162963931004928)]} 0 0
[junit4] 2> 324573 T1207 C223 P52363 oasup.LogUpdateProcessor.finish [collection1] webapp=/tr_/xq path=/update params={version=2&wt=javabin} {add=[n/1!117 (1456162963931004928)]} 0 5
[junit4] 2> 324585 T1146 oasc.ChaosMonkey.monkeyLog monkey: stop shard! 36682
[junit4] 2> 324585 T1146 oasc.CoreContainer.shutdown Shutting down CoreContainer instance=2033417453
[junit4] 2> 324586 T1146 oasc.ZkController.publish publishing core=collection1 state=down
[junit4] 2> 324586 T1146 oasc.ZkController.publish numShards not found on descriptor - reading it from system property
[junit4] 2> 324617 T1172 oasc.DistributedQueue$LatchChildWatcher.process LatchChildWatcher fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
{noformat}
...which then cascaded into an "IllegalStateException: Connection pool shut down"...
{noformat}
[junit4] 2> 324711 T1281 oascc.ZkStateReader.createClusterStateWatchersAndUpdate Updating cluster state from ZooKeeper...
[junit4] 2> 324718 T1197 oascc.SolrZkClient.makePath makePath: /overseer_elect/leader
[junit4] 2> 324722 T1146 oejsh.ContextHandler.doStop stopped o.e.j.s.ServletContextHandler{/tr_/xq,null}
[junit4] 2> 324726 T1197 oasc.Overseer.start Overseer (id=91010183481589766-127.0.0.1:47570_tr_%2Fxq-n_0000000001) starting
[junit4] 2> 324727 T1281 oasc.MigrateRouteKeyTest$Indexer.run ERROR Exception while adding document id: id=o/1!118 org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.SolrServerException: java.lang.IllegalStateException: Connection pool shut down
[junit4] 2> at org.apache.solr.client.solrj.impl.CloudSolrServer.directUpdate(CloudSolrServer.java:361)
[junit4] 2> at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:510)
[junit4] 2> at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
[junit4] 2> at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
[junit4] 2> at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
[junit4] 2> at org.apache.solr.cloud.MigrateRouteKeyTest$Indexer.run(MigrateRouteKeyTest.java:252)
[junit4] 2> Caused by: org.apache.solr.client.solrj.SolrServerException: java.lang.IllegalStateException: Connection pool shut down
[junit4] 2> at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:310)
[junit4] 2> at org.apache.solr.client.solrj.impl.CloudSolrServer.directUpdate(CloudSolrServer.java:358)
[junit4] 2> ... 5 more
[junit4] 2> Caused by: java.lang.IllegalStateException: Connection pool shut down
[junit4] 2> at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:167)
[junit4] 2> at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:199)
[junit4] 2> at org.apache.http.impl.conn.PoolingClientConnectionManager.requestConnection(PoolingClientConnectionManager.java:188)
[junit4] 2> at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:448)
[junit4] 2> at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
[junit4] 2> at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
[junit4] 2> at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
[junit4] 2> at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:395)
[junit4] 2> at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199)
[junit4] 2> at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:283)
[junit4] 2> ... 6 more
[junit4] 2>
{noformat}
...lots more "IllegalStateException: Connection pool shut down" for various doc add attempts, ultimatly leading to...
{noformat}
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=MigrateRouteKeyTest -Dtests.method=testDistribSearch -Dtests.seed=26F9FF92782251B6 -Dtests.slow=true -Dtests.locale=sr_CS -Dtests.timezone=Europe/Zaporozhye -Dtests.file.encoding=UTF-8
[junit4] ERROR 28.8s J2 | MigrateRouteKeyTest.testDistribSearch <<<
[junit4] > Throwable #1: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Server Error
[junit4] > request: http://127.0.0.1:52363/tr_/xq/admin/collections?action=MIGRATE&collection=collection1&target.collection=migrate_multipleshardtest_targetCollection&split.key=a/1!&forward.timeout=45&wt=javabin&version=2
[junit4] > at __randomizedtesting.SeedInfo.seed([26F9FF92782251B6:A71F718A0F7D318A]:0)
[junit4] > at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:495)
[junit4] > at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199)
[junit4] > at org.apache.solr.cloud.MigrateRouteKeyTest.invoke(MigrateRouteKeyTest.java:139)
[junit4] > at org.apache.solr.cloud.MigrateRouteKeyTest.invokeMigrateApi(MigrateRouteKeyTest.java:125)
[junit4] > at org.apache.solr.cloud.MigrateRouteKeyTest.multipleShardMigrateTest(MigrateRouteKeyTest.java:198)
[junit4] > at org.apache.solr.cloud.MigrateRouteKeyTest.doTest(MigrateRouteKeyTest.java:93)
[junit4] > at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:867)
[junit4] > at java.lang.Thread.run(Thread.java:724)
{noformat}
...and then...
{noformat}
[junit4] 2> 50947 T1145 ccr.ThreadLeakControl.checkThreadLeaks SEVERE 4 threads leaked from SUITE scope at org.apache.solr.cloud.MigrateRouteKeyTest:
[junit4] 2> 1) Thread[id=1298, name=Timer-98, state=TIMED_WAITING, group=TGRP-MigrateRouteKeyTest]
[junit4] 2> at java.lang.Object.wait(Native Method)
[junit4] 2> at java.util.TimerThread.mainLoop(Timer.java:552)
[junit4] 2> at java.util.TimerThread.run(Timer.java:505)
[junit4] 2> 2) Thread[id=1299, name=Thread-532-SendThread(localhost:46743), state=TIMED_WAITING, group=TGRP-MigrateRouteKeyTest]
[junit4] 2> at java.lang.Thread.sleep(Native Method)
[junit4] 2> at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:86)
[junit4] 2> at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:937)
[junit4] 2> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:993)
[junit4] 2> 3) Thread[id=1300, name=Thread-532-EventThread, state=WAITING, group=TGRP-MigrateRouteKeyTest]
[junit4] 2> at sun.misc.Unsafe.park(Native Method)
[junit4] 2> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
[junit4] 2> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
[junit4] 2> at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
[junit4] 2> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:491)
[junit4] 2> 4) Thread[id=1281, name=Thread-532, state=TIMED_WAITING, group=TGRP-MigrateRouteKeyTest]
[junit4] 2> at java.lang.Object.wait(Native Method)
[junit4] 2> at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:204)
[junit4] 2> at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:139)
[junit4] 2> at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:98)
[junit4] 2> at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:88)
[junit4] 2> at org.apache.solr.common.cloud.ZkStateReader.<init>(ZkStateReader.java:183)
[junit4] 2> at org.apache.solr.client.solrj.impl.CloudSolrServer.connect(CloudSolrServer.java:240)
[junit4] 2> at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:501)
[junit4] 2> at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
[junit4] 2> at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
[junit4] 2> at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
[junit4] 2> at org.apache.solr.cloud.MigrateRouteKeyTest$Indexer.run(MigrateRouteKeyTest.java:252)
{noformat}
> NPE in OverseerCollectionProcessor.migrateKey - inconsistent failure from MigrateRouteKeyTest
> ---------------------------------------------------------------------------------------------
>
> Key: SOLR-5601
> URL: https://issues.apache.org/jira/browse/SOLR-5601
> Project: Solr
> Issue Type: Bug
> Reporter: Hoss Man
>
> Encountered a weird _semi-reproducible_ failure from MigrateRouteKeyTest....
> {code}
> ant test -Dtestcase=MigrateRouteKeyTest -Dtests.method=testDistribSearch -Dtests.seed=26F9FF92782251B6 -Dtests.slow=true -Dtests.locale=sr_CS -Dtests.timezone=Europe/Zaporozhye -Dtests.file.encoding=UTF-8
> {code}
> Attempting to reproduce with that line caused failures the first 2 times i tried, and then succeeded every time after that.
> Details of failure to follow in comment.
> (NOTE: this may be related to some recent jenkins failures due to thread leaks from this test)
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org