Posted to dev@lucene.apache.org by "Daisy.Yuan (JIRA)" <ji...@apache.org> on 2017/05/18 11:20:04 UTC

[jira] [Updated] (SOLR-10704) REPLACENODE can make the collection lose data

     [ https://issues.apache.org/jira/browse/SOLR-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daisy.Yuan updated SOLR-10704:
------------------------------
    Description: 
1. Collections' replicas distribution: 
replace-hdfs-coll1 has two shards; each shard has one replica, and its index files are stored on HDFS.
replace-hdfs-coll1_shard1_replica1  on node 192.168.229.219
replace-hdfs-coll1_shard2_replica1  on node 192.168.228.193

replace-hdfs-coll2 has two shards; each shard has two replicas, and its index files are stored on HDFS.
replace-hdfs-coll2_shard1_replica1  on node 192.168.229.219
replace-hdfs-coll2_shard1_replica2  on node 192.168.229.193

replace-hdfs-coll2_shard2_replica1  on node 192.168.228.193
replace-hdfs-coll2_shard2_replica2  on node 192.168.229.219

replace-local-coll1 has two shards; each shard has one replica, and its index files are stored on local disk.
replace-local-coll1_shard1_replica1  on node 192.168.228.193
replace-local-coll1_shard2_replica1  on node 192.168.229.219

replace-local-coll2 has two shards; each shard has two replicas, and its index files are stored on local disk.
replace-local-coll2_shard1_replica1  on node 192.168.229.193
replace-local-coll2_shard1_replica2 on node 192.168.229.219

replace-local-coll2_shard2_replica1  on node 192.168.228.193
replace-local-coll2_shard2_replica2  on node 192.168.229.219
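
For context, a layout like the one above can be produced with the Collections API CREATE action, pinning replicas to particular nodes with createNodeSet. The sketch below is only illustrative: the config set name ("replace-hdfs-conf") is an assumption, and the HDFS storage itself would be configured in that config set's solrconfig.xml (HdfsDirectoryFactory), not through this request.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Illustrative sketch: create a 2-shard, 1-replica collection pinned to the two
// nodes listed above. The config set name is an assumption for this environment.
public class CreateCollectionSketch {
    public static void main(String[] args) throws Exception {
        String url = "http://192.168.228.193:21100/solr/admin/collections"
                + "?action=CREATE&name=replace-hdfs-coll1"
                + "&numShards=2&replicationFactor=1"
                + "&createNodeSet=192.168.229.219:21100_solr,192.168.228.193:21100_solr"
                + "&collection.configName=replace-hdfs-conf&wt=json";
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            in.lines().forEach(System.out::println);  // print the JSON response
        }
    }
}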

2. Execute REPLACENODE to replace node 192.168.229.219 with node 192.168.229.137
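
For reference, the request has the shape below; the source and target parameters are taken from the CollectionsHandler log quoted under step 5, while the node that receives the request is an assumption (any live node can accept it). This is a minimal sketch of the call, not the client actually used in the test.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Minimal sketch of the REPLACENODE call from step 2; the source/target node names
// and wt=json match the CollectionsHandler log line in step 5.
public class ReplaceNodeSketch {
    public static void main(String[] args) throws Exception {
        String url = "http://192.168.228.193:21100/solr/admin/collections"
                + "?action=REPLACENODE"
                + "&source=192.168.229.219:21100_solr"
                + "&target=192.168.229.137:21103_solr"
                + "&wt=json";
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            in.lines().forEach(System.out::println);  // print the JSON response
        }
    }
}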

3. The REPLACENODE request was executed successfully

4. The target replica replace-hdfs-coll1_shard1_replica2 had not finished recovering, but the source replica replace-hdfs-coll1_shard1_replica1 had already been deleted. As a result, the target's recovery failed with the following exception:
2017-05-18 17:08:48,587 | ERROR | recoveryExecutor-3-thread-2-processing-n:192.168.229.137:21103_solr x:replace-hdfs-coll1_shard1_replica2 s:shard1 c:replace-hdfs-coll1 r:core_node3 | Error while trying to recover. core=replace-hdfs-coll1_shard1_replica2:java.lang.NullPointerException
        at org.apache.solr.update.PeerSync.alreadyInSync(PeerSync.java:339)

5. The main log messages
Log from 193.log; node 192.168.228.193 holds the overseer role.
step 1. Node 192.168.229.193 received the REPLACENODE request
2017-05-18 17:08:32,717 | INFO  | http-nio-21100-exec-6 | Invoked Collection Action :replacenode with params action=REPLACENODE&source=192.168.229.219:21100_solr&wt=json&target=192.168.229.137:21103_solr and sendToOCPQueue=true | org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:203)

step 2. OverseerCollectionConfigSetProcessor picks up the task message and processes REPLACENODE

step 3. Add the new replica on the target node
2017-05-18 17:08:36,592 | INFO  | OverseerStateUpdate-1225069473835599708-192.168.228.193:21100_solr-n_0000000063 | processMessage: queueSize: 1, message = {
  "core":"replace-hdfs-coll1_shard1_replica2",
  "roles":null,
  "base_url":"http://192.168.229.137:21103/solr",
  "node_name":"192.168.229.137:21103_solr",
  "state":"down",
  "shard":"shard1",
  "collection":"replace-hdfs-coll1",
  "operation":"state"} current state version: 42 | org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:221)
  
 2017-05-18 17:08:40,540 | INFO  | OverseerStateUpdate-1225069473835599708-192.168.228.193:21100_solr-n_0000000063 | processMessage: queueSize: 1, message = {
  "core":"replace-hdfs-coll1_shard1_replica2",
  "core_node_name":"core_node3",
  "dataDir":"hdfs://hacluster//user/solr//SolrServer1/replace-hdfs-coll1/core_node3/data/",
  "roles":null,
  "base_url":"http://192.168.229.137:21103/solr",
  "node_name":"192.168.229.137:21103_solr",
  "state":"recovering",
  "shard":"shard1",
  "collection":"replace-hdfs-coll1",
  "operation":"state",
  "ulogDir":"hdfs://hacluster/user/solr/SolrServer1/replace-hdfs-coll1/core_node3/data/tlog"} current state version: 42 | org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:221)
step 4. Delete the source core (deletecore)
2017-05-18 17:08:47,552 | INFO  | OverseerStateUpdate-1225069473835599708-192.168.228.193:21100_solr-n_0000000063 | processMessage: queueSize: 1, message = {
  "operation":"deletecore",
  "core":"replace-hdfs-coll1_shard1_replica1",
  "node_name":"192.168.229.219:21100_solr",
  "collection":"replace-hdfs-coll1",
  "core_node_name":"core_node2"} current state version: 42 | org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:221)
  
192.168.229.219 is the source node; its core data directories are removed:
  2017-05-18 17:08:47,484 | INFO  | http-nio-21100-exec-6 | Removing directory before core close: hdfs://hacluster//user/solr//SolrServerAdmin/replace-hdfs-coll1/core_node2/data/index | org.apache.solr.core.CachingDirectoryFactory.closeCacheValue(CachingDirectoryFactory.java:271)
2017-05-18 17:08:47,515 | INFO  | http-nio-21100-exec-6 | Removing directory after core close: hdfs://hacluster//user/solr//SolrServerAdmin/replace-hdfs-coll1/core_node2/data | org.apache.solr.core.CachingDirectoryFactory.close(CachingDirectoryFactory.java:204)

192.168.229.137 is the target node, but recovery of replace-hdfs-coll1_shard1_replica2 has not finished yet:
 2017-05-18 17:08:48,547 | INFO  | recoveryExecutor-3-thread-2-processing-n:192.168.229.137:21103_solr x:replace-hdfs-coll1_shard1_replica2 s:shard1 c:replace-hdfs-coll1 r:core_node3 | Attempting to PeerSync from [http://192.168.229.219:21100/solr/replace-hdfs-coll1_shard1_replica1/] - recoveringAfterStartup=[true] | org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:370)
2017-05-18 17:08:48,547 | INFO  | recoveryExecutor-3-thread-2-processing-n:192.168.229.137:21103_solr x:replace-hdfs-coll1_shard1_replica2 s:shard1 c:replace-hdfs-coll1 r:core_node3 | PeerSync: core=replace-hdfs-coll1_shard1_replica2 url=http://192.168.229.137:21103/solr START replicas=[http://192.168.229.219:21100/solr/replace-hdfs-coll1_shard1_replica1/] nUpdates=100 | org.apache.solr.update.PeerSync.sync(PeerSync.java:214)
2017-05-18 17:08:48,587 | ERROR | recoveryExecutor-3-thread-2-processing-n:192.168.229.137:21103_solr x:replace-hdfs-coll1_shard1_replica2 s:shard1 c:replace-hdfs-coll1 r:core_node3 | Error while trying to recover. core=replace-hdfs-coll1_shard1_replica2:java.lang.NullPointerException
        at org.apache.solr.update.PeerSync.alreadyInSync(PeerSync.java:339)
        at org.apache.solr.update.PeerSync.sync(PeerSync.java:222)
        at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:376)
        at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:221)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
 | org.apache.solr.common.SolrException.log(SolrException.java:159)
2017-05-18 17:08:48,587 | INFO  | recoveryExecutor-3-thread-2-processing-n:192.168.229.137:21103_solr x:replace-hdfs-coll1_shard1_replica2 s:shard1 c:replace-hdfs-coll1 r:core_node3 | Replay not started, or was not successful... still buffering updates. | org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:441)
2017-05-18 17:08:48,587 | ERROR | recoveryExecutor-3-thread-2-processing-n:192.168.229.137:21103_solr x:replace-hdfs-coll1_shard1_replica2 s:shard1 c:replace-hdfs-coll1 r:core_node3 | Recovery failed - trying again... (0) | org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:478)
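
The timestamps above show the race: the new replica on the target node goes into "recovering" at 17:08:40, the overseer issues deletecore for the source at 17:08:47, and PeerSync against the now-deleted source core fails at 17:08:48. The sketch below only illustrates the ordering that appears to be missing, namely waiting until the target replica reports state "active" before the source is removed. It is a hand-written check against the public CLUSTERSTATUS action, not the overseer's actual code, and the naive string match on the JSON response is a simplifying assumption.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch of the missing ordering: do not remove the source replica until the
// target replica is "active". Polls CLUSTERSTATUS and uses a naive string check
// on the JSON (a real check would parse the cluster state properly).
public class WaitForActiveSketch {

    static String fetch(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            StringBuilder sb = new StringBuilder();
            in.lines().forEach(sb::append);
            return sb.toString();
        }
    }

    public static void main(String[] args) throws Exception {
        String status = "http://192.168.229.137:21103/solr/admin/collections"
                + "?action=CLUSTERSTATUS&collection=replace-hdfs-coll1&shard=shard1&wt=json";
        for (int i = 0; i < 60; i++) {
            String json = fetch(status);
            // Naive check: the new core name appears and a replica of this shard is active.
            if (json.contains("replace-hdfs-coll1_shard1_replica2")
                    && json.contains("\"state\":\"active\"")) {
                System.out.println("target replica is active; the source could now be removed safely");
                return;
            }
            Thread.sleep(5000);  // wait and poll again
        }
        System.out.println("target replica never became active; the source must not be removed");
    }
}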



> REPLACENODE can make the collection lose data
> ---------------------------------------------
>
>                 Key: SOLR-10704
>                 URL: https://issues.apache.org/jira/browse/SOLR-10704
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: 6.2
>         Environment: Red Hat 4.8.3-9, JDK 1.8.0_121
>            Reporter: Daisy.Yuan
>