You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by Marc Campeau <ca...@gmail.com> on 2014/07/16 22:44:35 UTC

Shard Replicas not getting replicated data from leader

Hi,

I have setup 4 Solr (4.9.0) Nodes into a single shard for a given
collection, meaning I should have 4 replicated nodes. I have 3 Zookeepers
in ensemble managing the configs for this collection. I have a load
balancer in front of the 4 nodes to split traffic between them.

I start this collection with an empty data/index directory.

When I send /update requests to the load balancers I see these going to all
4 nodes. Also, I can see that all FOLLOWERs distribute the requests they
receive to the LEADER as is expected. But for some reason the FOLLOWERS are
not getting /replication requests from the LEADER.  So the collection for
the leader contains many thousand of documents and is on the 8th
generation. I see that it's replicable in the admin interface, yet all
FOLLOWER nodes have an empty index.

Hence, I need your insights please.

Thanks,

Marc

To Note:

When I startup my nodes I see the following error in solr.log:
1) When Zookeeper does a clusterstate update, all nodes have their starte
"DOWN", why? This I means that in the Solr Admin interface they show up has
down. This never updates to active.

2) I have a warning :  org.apache.solr.rest.ManagedResource; No registered
observers for /rest/managed, which I need to update solrconfig.xml to fix

3) I have the following error:
ERROR - 2014-07-16 19:49:25.336; org.apache.solr.cloud.SyncStrategy; No
UpdateLog found - cannot sync

SOLR.LOG
-------------------------
[....]
INFO  - 2014-07-16 19:47:30.870;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
numShards=null message={
  "operation":"state",
  "state":"down",
  "base_url":"http://192.168.150.90:8983/solr",
  "core":"collection_name",
  "roles":null,
  "node_name":"192.168.150.90:8983_solr",
  "shard":null,
  "collection":"collection_name",
  "numShards":null,
  "core_node_name":null}
INFO  - 2014-07-16 19:47:30.871;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node1 is
already registered
[....]
WARN  - 2014-07-16 19:47:34.535; org.apache.solr.rest.ManagedResource; No
registered observers for /rest/managed
[....]
INFO  - 2014-07-16 19:48:25.135;
org.apache.solr.common.cloud.ZkStateReader$3; Updating live nodes... (2)
INFO  - 2014-07-16 19:48:25.287;
org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
INFO  - 2014-07-16 19:48:25.291;
org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
ZooKeeper...
INFO  - 2014-07-16 19:48:25.293;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
numShards=null message={
  "operation":"state",
  "state":"down",
  "base_url":"http://192.168.200.90:8983/solr",
  "core":"collection_name",
  "roles":null,
  "node_name":"192.168.200.90:8983_solr",
  "shard":null,
  "collection":"collection_name",
  "numShards":null,
  "core_node_name":null}
INFO  - 2014-07-16 19:48:25.293;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node2 is
already registered
INFO  - 2014-07-16 19:48:25.293;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; shard=shard1 is already
registered
[....]
INFO  - 2014-07-16 19:49:00.188;
org.apache.solr.common.cloud.ZkStateReader$3; Updating live nodes... (3)
INFO  - 2014-07-16 19:49:00.322;
org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
INFO  - 2014-07-16 19:49:00.335;
org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
ZooKeeper...
INFO  - 2014-07-16 19:49:00.337;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
numShards=null message={
  "operation":"state",
  "state":"down",
  "base_url":"http://192.168.200.91:8983/solr",
  "core":"collection_name",
  "roles":null,
  "node_name":"192.168.200.91:8983_solr",
  "shard":null,
  "collection":"collection_name",
  "numShards":null,
  "core_node_name":null}
INFO  - 2014-07-16 19:49:00.337;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node3 is
already registered
INFO  - 2014-07-16 19:49:00.337;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; shard=shard1 is already
registered
[....]
INFO  - 2014-07-16 19:49:21.220;
org.apache.solr.common.cloud.ZkStateReader$3; Updating live nodes... (4)
INFO  - 2014-07-16 19:49:21.350;
org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
INFO  - 2014-07-16 19:49:21.357;
org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
ZooKeeper...
INFO  - 2014-07-16 19:49:21.359;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
numShards=null message={
  "operation":"state",
  "state":"down",
  "base_url":"http://192.168.150.91:8983/solr",
  "core":"collection_name",
  "roles":null,
  "node_name":"192.168.150.91:8983_solr",
  "shard":null,
  "collection":"collection_name",
  "numShards":null,
  "core_node_name":null}
INFO  - 2014-07-16 19:49:21.359;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node4 is
already registered
INFO  - 2014-07-16 19:49:21.359;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; shard=shard1 is already
registered
[....]
INFO  - 2014-07-16 19:49:25.336;
org.apache.solr.cloud.ShardLeaderElectionContext; Enough replicas found to
continue.
INFO  - 2014-07-16 19:49:25.336;
org.apache.solr.cloud.ShardLeaderElectionContext; I may be the new leader -
try and sync
INFO  - 2014-07-16 19:49:25.336; org.apache.solr.cloud.SyncStrategy; Sync
replicas to http://192.168.150.90:8983/solr/pde_frontend/
ERROR - 2014-07-16 19:49:25.336; org.apache.solr.cloud.SyncStrategy; No
UpdateLog found - cannot sync
INFO  - 2014-07-16 19:49:25.337;
org.apache.solr.cloud.ShardLeaderElectionContext; We failed sync, but we
have no versions - we can't sync in that case - we were active before, so
become leader anyway
INFO  - 2014-07-16 19:49:25.337;
org.apache.solr.cloud.ShardLeaderElectionContext; I am the new leader:
http://192.168.150.90:8983/solr/pde_frontend/ shard1
INFO  - 2014-07-16 19:49:25.338; org.apache.solr.common.cloud.SolrZkClient;
makePath: /collections/pde_frontend/leaders/shard1
INFO  - 2014-07-16 19:49:25.360;
org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
INFO  - 2014-07-16 19:49:25.364;
org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
ZooKeeper...
INFO  - 2014-07-16 19:49:25.376;
org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
INFO  - 2014-07-16 19:49:25.381;
org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change:
WatchedEvent state:SyncConnected type:NodeDataChanged
path:/clusterstate.json, has occurred - updating... (live nodes size: 4)
INFO  - 2014-07-16 19:49:25.413; org.apache.solr.cloud.ZkController; We are
http://192.168.150.90:8983/solr/pde_frontend/ and leader is
http://192.168.150.90:8983/solr/pde_frontend/
INFO  - 2014-07-16 19:49:25.414;
org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
ZooKeeper...
INFO  - 2014-07-16 19:49:28.848; org.apache.solr.core.SolrCore;
[pde_frontend] webapp=/solr path=/admin/ping params={} hits=0 status=0
QTime=1
[....]
INFO  - 2014-07-16 19:53:18.177;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
http://192.168.200.91:8983/solr/collection_name/&wt=javabin&version=2}
{add=[Listing_3339682]} 0 1
INFO  - 2014-07-16 19:53:18.213;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update/ params={indent=on&version=2.2&wt=json}
{add=[Listing_3335358]} 0 4
INFO  - 2014-07-16 19:53:18.242;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
http://192.168.200.90:8983/solr/collection_name/&wt=javabin&version=2}
{add=[Listing_3338996]} 0 1
INFO  - 2014-07-16 19:53:18.258; org.apache.solr.core.SolrDeletionPolicy;
SolrDeletionPolicy.onCommit: commits: num=2
commit{dir=NRTCachingDirectory(MMapDirectory@/opt/solr_collections/collection_name/data/index
lockFactory=NativeFSLockFactory@/opt/solr_collections/collection_name/data/index;
maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_1,generation=1}
commit{dir=NRTCachingDirectory(MMapDirectory@/opt/solr_collections/collection_name/data/index
lockFactory=NativeFSLockFactory@/opt/solr_collections/collection_name/data/index;
maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_2,generation=2}
INFO  - 2014-07-16 19:53:18.258; org.apache.solr.core.SolrDeletionPolicy;
newest commit generation = 2
INFO  - 2014-07-16 19:53:18.296;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
http://192.168.150.91:8983/solr/collection_name/&wt=javabin&version=2}
{add=[Listing_3332883]} 0 6
INFO  - 2014-07-16 19:53:18.372;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
http://192.168.200.91:8983/solr/collection_name/&wt=javabin&version=2}
{add=[Listing_3328261]} 0 1
INFO  - 2014-07-16 19:53:18.389; org.apache.solr.search.SolrIndexSearcher;
Opening Searcher@7daec974[collection_name] realtime
INFO  - 2014-07-16 19:53:18.390;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update/ params={indent=on&version=2.2&wt=json}
{add=[Listing_3328952]} 0 3
INFO  - 2014-07-16 19:53:18.392;
org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
INFO  - 2014-07-16 19:53:18.407;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
http://192.168.200.90:8983/solr/collection_name/&wt=javabin&version=2}
{add=[Listing_3121200]} 0 0
INFO  - 2014-07-16 19:53:18.437;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
http://192.168.150.91:8983/solr/collection_name/&wt=javabin&version=2}
{add=[Listing_3502558]} 0 1
INFO  - 2014-07-16 19:53:18.459;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
http://192.168.200.91:8983/solr/collection_name/&wt=javabin&version=2}
{add=[Listing_3502564]} 0 1

Re: Shard Replicas not getting replicated data from leader

Posted by Erick Erickson <er...@gmail.com>.
Right, so did that fix your problem?


On Thu, Jul 17, 2014 at 7:10 AM, Marc Campeau <ca...@gmail.com> wrote:

> Turns out updateLog had been disabled in solrconfig.xml
>
> Marc
>
>
> 2014-07-16 16:44 GMT-04:00 Marc Campeau <ca...@gmail.com>:
>
> > Hi,
> >
> > I have setup 4 Solr (4.9.0) Nodes into a single shard for a given
> > collection, meaning I should have 4 replicated nodes. I have 3 Zookeepers
> > in ensemble managing the configs for this collection. I have a load
> > balancer in front of the 4 nodes to split traffic between them.
> >
> > I start this collection with an empty data/index directory.
> >
> > When I send /update requests to the load balancers I see these going to
> > all 4 nodes. Also, I can see that all FOLLOWERs distribute the requests
> > they receive to the LEADER as is expected. But for some reason the
> > FOLLOWERS are not getting /replication requests from the LEADER.  So the
> > collection for the leader contains many thousand of documents and is on
> the
> > 8th generation. I see that it's replicable in the admin interface, yet
> all
> > FOLLOWER nodes have an empty index.
> >
> > Hence, I need your insights please.
> >
> > Thanks,
> >
> > Marc
> >
> > To Note:
> >
> > When I startup my nodes I see the following error in solr.log:
> > 1) When Zookeeper does a clusterstate update, all nodes have their starte
> > "DOWN", why? This I means that in the Solr Admin interface they show up
> has
> > down. This never updates to active.
> >
> > 2) I have a warning :  org.apache.solr.rest.ManagedResource; No
> registered
> > observers for /rest/managed, which I need to update solrconfig.xml to fix
> >
> > 3) I have the following error:
> > ERROR - 2014-07-16 19:49:25.336; org.apache.solr.cloud.SyncStrategy; No
> > UpdateLog found - cannot sync
> >
> > SOLR.LOG
> > -------------------------
> > [....]
> > INFO  - 2014-07-16 19:47:30.870;
> > org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
> > numShards=null message={
> >   "operation":"state",
> >   "state":"down",
> >   "base_url":"http://192.168.150.90:8983/solr",
> >   "core":"collection_name",
> >   "roles":null,
> >   "node_name":"192.168.150.90:8983_solr",
> >   "shard":null,
> >   "collection":"collection_name",
> >   "numShards":null,
> >   "core_node_name":null}
> > INFO  - 2014-07-16 19:47:30.871;
> > org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node1 is
> > already registered
> > [....]
> > WARN  - 2014-07-16 19:47:34.535; org.apache.solr.rest.ManagedResource; No
> > registered observers for /rest/managed
> > [....]
> > INFO  - 2014-07-16 19:48:25.135;
> > org.apache.solr.common.cloud.ZkStateReader$3; Updating live nodes... (2)
> > INFO  - 2014-07-16 19:48:25.287;
> > org.apache.solr.cloud.DistributedQueue$LatchChildWatcher;
> LatchChildWatcher
> > fired on path: /overseer/queue state: SyncConnected type
> NodeChildrenChanged
> > INFO  - 2014-07-16 19:48:25.291;
> > org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
> > ZooKeeper...
> > INFO  - 2014-07-16 19:48:25.293;
> > org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
> > numShards=null message={
> >   "operation":"state",
> >   "state":"down",
> >   "base_url":"http://192.168.200.90:8983/solr",
> >   "core":"collection_name",
> >   "roles":null,
> >   "node_name":"192.168.200.90:8983_solr",
> >   "shard":null,
> >   "collection":"collection_name",
> >   "numShards":null,
> >   "core_node_name":null}
> > INFO  - 2014-07-16 19:48:25.293;
> > org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node2 is
> > already registered
> > INFO  - 2014-07-16 19:48:25.293;
> > org.apache.solr.cloud.Overseer$ClusterStateUpdater; shard=shard1 is
> already
> > registered
> > [....]
> > INFO  - 2014-07-16 19:49:00.188;
> > org.apache.solr.common.cloud.ZkStateReader$3; Updating live nodes... (3)
> > INFO  - 2014-07-16 19:49:00.322;
> > org.apache.solr.cloud.DistributedQueue$LatchChildWatcher;
> LatchChildWatcher
> > fired on path: /overseer/queue state: SyncConnected type
> NodeChildrenChanged
> > INFO  - 2014-07-16 19:49:00.335;
> > org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
> > ZooKeeper...
> > INFO  - 2014-07-16 19:49:00.337;
> > org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
> > numShards=null message={
> >   "operation":"state",
> >   "state":"down",
> >   "base_url":"http://192.168.200.91:8983/solr",
> >   "core":"collection_name",
> >   "roles":null,
> >   "node_name":"192.168.200.91:8983_solr",
> >   "shard":null,
> >   "collection":"collection_name",
> >   "numShards":null,
> >   "core_node_name":null}
> > INFO  - 2014-07-16 19:49:00.337;
> > org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node3 is
> > already registered
> > INFO  - 2014-07-16 19:49:00.337;
> > org.apache.solr.cloud.Overseer$ClusterStateUpdater; shard=shard1 is
> already
> > registered
> > [....]
> > INFO  - 2014-07-16 19:49:21.220;
> > org.apache.solr.common.cloud.ZkStateReader$3; Updating live nodes... (4)
> > INFO  - 2014-07-16 19:49:21.350;
> > org.apache.solr.cloud.DistributedQueue$LatchChildWatcher;
> LatchChildWatcher
> > fired on path: /overseer/queue state: SyncConnected type
> NodeChildrenChanged
> > INFO  - 2014-07-16 19:49:21.357;
> > org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
> > ZooKeeper...
> > INFO  - 2014-07-16 19:49:21.359;
> > org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
> > numShards=null message={
> >   "operation":"state",
> >   "state":"down",
> >   "base_url":"http://192.168.150.91:8983/solr",
> >   "core":"collection_name",
> >   "roles":null,
> >   "node_name":"192.168.150.91:8983_solr",
> >   "shard":null,
> >   "collection":"collection_name",
> >   "numShards":null,
> >   "core_node_name":null}
> > INFO  - 2014-07-16 19:49:21.359;
> > org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node4 is
> > already registered
> > INFO  - 2014-07-16 19:49:21.359;
> > org.apache.solr.cloud.Overseer$ClusterStateUpdater; shard=shard1 is
> already
> > registered
> > [....]
> > INFO  - 2014-07-16 19:49:25.336;
> > org.apache.solr.cloud.ShardLeaderElectionContext; Enough replicas found
> to
> > continue.
> > INFO  - 2014-07-16 19:49:25.336;
> > org.apache.solr.cloud.ShardLeaderElectionContext; I may be the new
> leader -
> > try and sync
> > INFO  - 2014-07-16 19:49:25.336; org.apache.solr.cloud.SyncStrategy; Sync
> > replicas to http://192.168.150.90:8983/solr/pde_frontend/
> > ERROR - 2014-07-16 19:49:25.336; org.apache.solr.cloud.SyncStrategy; No
> > UpdateLog found - cannot sync
> > INFO  - 2014-07-16 19:49:25.337;
> > org.apache.solr.cloud.ShardLeaderElectionContext; We failed sync, but we
> > have no versions - we can't sync in that case - we were active before, so
> > become leader anyway
> > INFO  - 2014-07-16 19:49:25.337;
> > org.apache.solr.cloud.ShardLeaderElectionContext; I am the new leader:
> > http://192.168.150.90:8983/solr/pde_frontend/ shard1
> > INFO  - 2014-07-16 19:49:25.338;
> > org.apache.solr.common.cloud.SolrZkClient; makePath:
> > /collections/pde_frontend/leaders/shard1
> > INFO  - 2014-07-16 19:49:25.360;
> > org.apache.solr.cloud.DistributedQueue$LatchChildWatcher;
> LatchChildWatcher
> > fired on path: /overseer/queue state: SyncConnected type
> NodeChildrenChanged
> > INFO  - 2014-07-16 19:49:25.364;
> > org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
> > ZooKeeper...
> > INFO  - 2014-07-16 19:49:25.376;
> > org.apache.solr.cloud.DistributedQueue$LatchChildWatcher;
> LatchChildWatcher
> > fired on path: /overseer/queue state: SyncConnected type
> NodeChildrenChanged
> > INFO  - 2014-07-16 19:49:25.381;
> > org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change:
> > WatchedEvent state:SyncConnected type:NodeDataChanged
> > path:/clusterstate.json, has occurred - updating... (live nodes size: 4)
> > INFO  - 2014-07-16 19:49:25.413; org.apache.solr.cloud.ZkController; We
> > are http://192.168.150.90:8983/solr/pde_frontend/ and leader is
> > http://192.168.150.90:8983/solr/pde_frontend/
> > INFO  - 2014-07-16 19:49:25.414;
> > org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
> > ZooKeeper...
> > INFO  - 2014-07-16 19:49:28.848; org.apache.solr.core.SolrCore;
> > [pde_frontend] webapp=/solr path=/admin/ping params={} hits=0 status=0
> > QTime=1
> > [....]
> > INFO  - 2014-07-16 19:53:18.177;
> > org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> > webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> > http://192.168.200.91:8983/solr/collection_name/&wt=javabin&version=2}
> > {add=[Listing_3339682]} 0 1
> > INFO  - 2014-07-16 19:53:18.213;
> > org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> > webapp=/solr path=/update/ params={indent=on&version=2.2&wt=json}
> > {add=[Listing_3335358]} 0 4
> > INFO  - 2014-07-16 19:53:18.242;
> > org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> > webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> > http://192.168.200.90:8983/solr/collection_name/&wt=javabin&version=2}
> > {add=[Listing_3338996]} 0 1
> > INFO  - 2014-07-16 19:53:18.258; org.apache.solr.core.SolrDeletionPolicy;
> > SolrDeletionPolicy.onCommit: commits: num=2
> > commit{dir=NRTCachingDirectory(MMapDirectory@
> /opt/solr_collections/collection_name/data/index
> > lockFactory=NativeFSLockFactory@
> /opt/solr_collections/collection_name/data/index;
> > maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_1,generation=1}
> >  commit{dir=NRTCachingDirectory(MMapDirectory@
> /opt/solr_collections/collection_name/data/index
> > lockFactory=NativeFSLockFactory@
> /opt/solr_collections/collection_name/data/index;
> > maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_2,generation=2}
> > INFO  - 2014-07-16 19:53:18.258; org.apache.solr.core.SolrDeletionPolicy;
> > newest commit generation = 2
> > INFO  - 2014-07-16 19:53:18.296;
> > org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> > webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> > http://192.168.150.91:8983/solr/collection_name/&wt=javabin&version=2}
> > {add=[Listing_3332883]} 0 6
> > INFO  - 2014-07-16 19:53:18.372;
> > org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> > webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> > http://192.168.200.91:8983/solr/collection_name/&wt=javabin&version=2}
> > {add=[Listing_3328261]} 0 1
> > INFO  - 2014-07-16 19:53:18.389;
> org.apache.solr.search.SolrIndexSearcher;
> > Opening Searcher@7daec974[collection_name] realtime
> > INFO  - 2014-07-16 19:53:18.390;
> > org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> > webapp=/solr path=/update/ params={indent=on&version=2.2&wt=json}
> > {add=[Listing_3328952]} 0 3
> > INFO  - 2014-07-16 19:53:18.392;
> > org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
> > INFO  - 2014-07-16 19:53:18.407;
> > org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> > webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> > http://192.168.200.90:8983/solr/collection_name/&wt=javabin&version=2}
> > {add=[Listing_3121200]} 0 0
> > INFO  - 2014-07-16 19:53:18.437;
> > org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> > webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> > http://192.168.150.91:8983/solr/collection_name/&wt=javabin&version=2}
> > {add=[Listing_3502558]} 0 1
> > INFO  - 2014-07-16 19:53:18.459;
> > org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> > webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> > http://192.168.200.91:8983/solr/collection_name/&wt=javabin&version=2}
> > {add=[Listing_3502564]} 0 1
> >
> >
>

Re: Shard Replicas not getting replicated data from leader

Posted by Marc Campeau <ca...@gmail.com>.
Turns out updateLog had been disabled in solrconfig.xml

Marc


2014-07-16 16:44 GMT-04:00 Marc Campeau <ca...@gmail.com>:

> Hi,
>
> I have setup 4 Solr (4.9.0) Nodes into a single shard for a given
> collection, meaning I should have 4 replicated nodes. I have 3 Zookeepers
> in ensemble managing the configs for this collection. I have a load
> balancer in front of the 4 nodes to split traffic between them.
>
> I start this collection with an empty data/index directory.
>
> When I send /update requests to the load balancers I see these going to
> all 4 nodes. Also, I can see that all FOLLOWERs distribute the requests
> they receive to the LEADER as is expected. But for some reason the
> FOLLOWERS are not getting /replication requests from the LEADER.  So the
> collection for the leader contains many thousand of documents and is on the
> 8th generation. I see that it's replicable in the admin interface, yet all
> FOLLOWER nodes have an empty index.
>
> Hence, I need your insights please.
>
> Thanks,
>
> Marc
>
> To Note:
>
> When I startup my nodes I see the following error in solr.log:
> 1) When Zookeeper does a clusterstate update, all nodes have their starte
> "DOWN", why? This I means that in the Solr Admin interface they show up has
> down. This never updates to active.
>
> 2) I have a warning :  org.apache.solr.rest.ManagedResource; No registered
> observers for /rest/managed, which I need to update solrconfig.xml to fix
>
> 3) I have the following error:
> ERROR - 2014-07-16 19:49:25.336; org.apache.solr.cloud.SyncStrategy; No
> UpdateLog found - cannot sync
>
> SOLR.LOG
> -------------------------
> [....]
> INFO  - 2014-07-16 19:47:30.870;
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
> numShards=null message={
>   "operation":"state",
>   "state":"down",
>   "base_url":"http://192.168.150.90:8983/solr",
>   "core":"collection_name",
>   "roles":null,
>   "node_name":"192.168.150.90:8983_solr",
>   "shard":null,
>   "collection":"collection_name",
>   "numShards":null,
>   "core_node_name":null}
> INFO  - 2014-07-16 19:47:30.871;
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node1 is
> already registered
> [....]
> WARN  - 2014-07-16 19:47:34.535; org.apache.solr.rest.ManagedResource; No
> registered observers for /rest/managed
> [....]
> INFO  - 2014-07-16 19:48:25.135;
> org.apache.solr.common.cloud.ZkStateReader$3; Updating live nodes... (2)
> INFO  - 2014-07-16 19:48:25.287;
> org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
> fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
> INFO  - 2014-07-16 19:48:25.291;
> org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
> ZooKeeper...
> INFO  - 2014-07-16 19:48:25.293;
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
> numShards=null message={
>   "operation":"state",
>   "state":"down",
>   "base_url":"http://192.168.200.90:8983/solr",
>   "core":"collection_name",
>   "roles":null,
>   "node_name":"192.168.200.90:8983_solr",
>   "shard":null,
>   "collection":"collection_name",
>   "numShards":null,
>   "core_node_name":null}
> INFO  - 2014-07-16 19:48:25.293;
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node2 is
> already registered
> INFO  - 2014-07-16 19:48:25.293;
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; shard=shard1 is already
> registered
> [....]
> INFO  - 2014-07-16 19:49:00.188;
> org.apache.solr.common.cloud.ZkStateReader$3; Updating live nodes... (3)
> INFO  - 2014-07-16 19:49:00.322;
> org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
> fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
> INFO  - 2014-07-16 19:49:00.335;
> org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
> ZooKeeper...
> INFO  - 2014-07-16 19:49:00.337;
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
> numShards=null message={
>   "operation":"state",
>   "state":"down",
>   "base_url":"http://192.168.200.91:8983/solr",
>   "core":"collection_name",
>   "roles":null,
>   "node_name":"192.168.200.91:8983_solr",
>   "shard":null,
>   "collection":"collection_name",
>   "numShards":null,
>   "core_node_name":null}
> INFO  - 2014-07-16 19:49:00.337;
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node3 is
> already registered
> INFO  - 2014-07-16 19:49:00.337;
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; shard=shard1 is already
> registered
> [....]
> INFO  - 2014-07-16 19:49:21.220;
> org.apache.solr.common.cloud.ZkStateReader$3; Updating live nodes... (4)
> INFO  - 2014-07-16 19:49:21.350;
> org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
> fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
> INFO  - 2014-07-16 19:49:21.357;
> org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
> ZooKeeper...
> INFO  - 2014-07-16 19:49:21.359;
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
> numShards=null message={
>   "operation":"state",
>   "state":"down",
>   "base_url":"http://192.168.150.91:8983/solr",
>   "core":"collection_name",
>   "roles":null,
>   "node_name":"192.168.150.91:8983_solr",
>   "shard":null,
>   "collection":"collection_name",
>   "numShards":null,
>   "core_node_name":null}
> INFO  - 2014-07-16 19:49:21.359;
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node4 is
> already registered
> INFO  - 2014-07-16 19:49:21.359;
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; shard=shard1 is already
> registered
> [....]
> INFO  - 2014-07-16 19:49:25.336;
> org.apache.solr.cloud.ShardLeaderElectionContext; Enough replicas found to
> continue.
> INFO  - 2014-07-16 19:49:25.336;
> org.apache.solr.cloud.ShardLeaderElectionContext; I may be the new leader -
> try and sync
> INFO  - 2014-07-16 19:49:25.336; org.apache.solr.cloud.SyncStrategy; Sync
> replicas to http://192.168.150.90:8983/solr/pde_frontend/
> ERROR - 2014-07-16 19:49:25.336; org.apache.solr.cloud.SyncStrategy; No
> UpdateLog found - cannot sync
> INFO  - 2014-07-16 19:49:25.337;
> org.apache.solr.cloud.ShardLeaderElectionContext; We failed sync, but we
> have no versions - we can't sync in that case - we were active before, so
> become leader anyway
> INFO  - 2014-07-16 19:49:25.337;
> org.apache.solr.cloud.ShardLeaderElectionContext; I am the new leader:
> http://192.168.150.90:8983/solr/pde_frontend/ shard1
> INFO  - 2014-07-16 19:49:25.338;
> org.apache.solr.common.cloud.SolrZkClient; makePath:
> /collections/pde_frontend/leaders/shard1
> INFO  - 2014-07-16 19:49:25.360;
> org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
> fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
> INFO  - 2014-07-16 19:49:25.364;
> org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
> ZooKeeper...
> INFO  - 2014-07-16 19:49:25.376;
> org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
> fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
> INFO  - 2014-07-16 19:49:25.381;
> org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change:
> WatchedEvent state:SyncConnected type:NodeDataChanged
> path:/clusterstate.json, has occurred - updating... (live nodes size: 4)
> INFO  - 2014-07-16 19:49:25.413; org.apache.solr.cloud.ZkController; We
> are http://192.168.150.90:8983/solr/pde_frontend/ and leader is
> http://192.168.150.90:8983/solr/pde_frontend/
> INFO  - 2014-07-16 19:49:25.414;
> org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
> ZooKeeper...
> INFO  - 2014-07-16 19:49:28.848; org.apache.solr.core.SolrCore;
> [pde_frontend] webapp=/solr path=/admin/ping params={} hits=0 status=0
> QTime=1
> [....]
> INFO  - 2014-07-16 19:53:18.177;
> org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> http://192.168.200.91:8983/solr/collection_name/&wt=javabin&version=2}
> {add=[Listing_3339682]} 0 1
> INFO  - 2014-07-16 19:53:18.213;
> org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> webapp=/solr path=/update/ params={indent=on&version=2.2&wt=json}
> {add=[Listing_3335358]} 0 4
> INFO  - 2014-07-16 19:53:18.242;
> org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> http://192.168.200.90:8983/solr/collection_name/&wt=javabin&version=2}
> {add=[Listing_3338996]} 0 1
> INFO  - 2014-07-16 19:53:18.258; org.apache.solr.core.SolrDeletionPolicy;
> SolrDeletionPolicy.onCommit: commits: num=2
> commit{dir=NRTCachingDirectory(MMapDirectory@/opt/solr_collections/collection_name/data/index
> lockFactory=NativeFSLockFactory@/opt/solr_collections/collection_name/data/index;
> maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_1,generation=1}
>  commit{dir=NRTCachingDirectory(MMapDirectory@/opt/solr_collections/collection_name/data/index
> lockFactory=NativeFSLockFactory@/opt/solr_collections/collection_name/data/index;
> maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_2,generation=2}
> INFO  - 2014-07-16 19:53:18.258; org.apache.solr.core.SolrDeletionPolicy;
> newest commit generation = 2
> INFO  - 2014-07-16 19:53:18.296;
> org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> http://192.168.150.91:8983/solr/collection_name/&wt=javabin&version=2}
> {add=[Listing_3332883]} 0 6
> INFO  - 2014-07-16 19:53:18.372;
> org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> http://192.168.200.91:8983/solr/collection_name/&wt=javabin&version=2}
> {add=[Listing_3328261]} 0 1
> INFO  - 2014-07-16 19:53:18.389; org.apache.solr.search.SolrIndexSearcher;
> Opening Searcher@7daec974[collection_name] realtime
> INFO  - 2014-07-16 19:53:18.390;
> org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> webapp=/solr path=/update/ params={indent=on&version=2.2&wt=json}
> {add=[Listing_3328952]} 0 3
> INFO  - 2014-07-16 19:53:18.392;
> org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
> INFO  - 2014-07-16 19:53:18.407;
> org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> http://192.168.200.90:8983/solr/collection_name/&wt=javabin&version=2}
> {add=[Listing_3121200]} 0 0
> INFO  - 2014-07-16 19:53:18.437;
> org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> http://192.168.150.91:8983/solr/collection_name/&wt=javabin&version=2}
> {add=[Listing_3502558]} 0 1
> INFO  - 2014-07-16 19:53:18.459;
> org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
> webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> http://192.168.200.91:8983/solr/collection_name/&wt=javabin&version=2}
> {add=[Listing_3502564]} 0 1
>
>