Posted to solr-user@lucene.apache.org by Sebastián Ramírez <se...@senseta.com> on 2013/05/20 22:21:11 UTC

Replica shards not updating their index when update is sent to them

Hello,

I'm having a little problem with a test SolrCloud cluster.

I've set up 3 nodes (SolrCores) to use an external Zookeeper. I use 1 shard
and the other 2 SolrCores are being auto-assigned as replicas.

Let's say I have these 3 nodes: the leader shard A, the replica shard B,
and the (other) replica shard C.

I can send queries to any node (A, B or C) and I get the results.

I can send updates to the leader shard (A) and get correct (updated)
results in any of the 3 shards (A, B, or C).

* Here is the problem:
When I send an update to a non-leader (replica) shard (B), the updated
results are reflected in the leader shard (A) and in the other replica
shard (C), but not in the shard that received the update (B). If I repeat
the process and send the update to the other non-leader shard (C), the
same thing happens: I get the updated results from the leader (A) and
from the other replica shard (B), but not from the shard that received
the update (C).

Any suggestion?

Thanks!

Sebastián Ramírez

-- 
*----------------------------------------------------*
*This e-mail transmission, including any attachments, is intended only for 
the named recipient(s) and may contain information that is privileged, 
confidential and/or exempt from disclosure under applicable law. If you 
have received this transmission in error, or are not the named 
recipient(s), please notify Senseta immediately by return e-mail and 
permanently delete this transmission, including any attachments.*

Re: Replica shards not updating their index when update is sent to them

Posted by Sebastián Ramírez <se...@senseta.com>.
I found how to solve the problem.

After sending a file to be indexed to a replica shard (node2):

curl 'http://node2:8983/solr/update?commit=true' \
  -H 'Content-type: text/xml' \
  --data-binary '<add><doc><field name="id">asdf</field><field name="content">big moth</field></doc></add>'

I can send a "commit" param to the same shard and then it gets updated:

curl 'http://node2:8983/solr/update?commit=true'


Another option is to send a "commitWithin" param, with a value in
milliseconds, in the original update request instead of "commit". The
commit then happens within the specified number of milliseconds, and the
changes are reflected in all shards, including the replica shard that
received the update request:

curl 'http://node2:8983/solr/update?commitWithin=10000'
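
A minimal sketch of the complete commitWithin request, with the document body included (the URL above shows only the parameter). The host name, port, document, and the 10000 ms value come from the commands in this thread; the function names build_update_url and send_with_commit_within are made up for this illustration and are not part of Solr.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solr): build the update URL for a node
# with a commitWithin value given in milliseconds.
build_update_url() {
    # $1 = host name, $2 = commitWithin in milliseconds
    printf 'http://%s:8983/solr/update?commitWithin=%s\n' "$1" "$2"
}

# Hypothetical helper: send the sample document from this thread to the
# given node and ask Solr to commit within $2 milliseconds.
send_with_commit_within() {
    url=$(build_update_url "$1" "$2")
    curl "$url" -H 'Content-type: text/xml' --data-binary \
        '<add><doc><field name="id">asdf</field><field name="content">big moth</field></doc></add>'
}

# Example (needs a running cluster):
#   send_with_commit_within node2 10000
```

Nothing is sent until send_with_commit_within is called, so the file can be sourced safely and the URL checked first with build_update_url.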


As these emails get archived, I hope this may help someone in the future.

Sebastián Ramírez




Re: Replica shards not updating their index when update is sent to them

Posted by Sebastián Ramírez <se...@senseta.com>.
Yes, it's happening with the latest version, 4.2.1.

Yes, it's easy to reproduce.
It happened using 3 Virtual Machines and also happened using 3 physical
nodes.


Here are the details:

I installed Hortonworks (a Hadoop distribution) in the 3 nodes. That
installs Zookeeper.

I used the "example" directory and copied it to the 3 nodes.

I start Zookeeper in the 3 nodes.

The first time, I run this command on each node, to start Solr:
java -jar -Dbootstrap_conf=true -DzkHost='node1,node2,node3' start.jar

As I understand it, "-Dbootstrap_conf=true" uploads the configuration to
Zookeeper, so I don't need to pass it again on later starts of each
SolrCore.

So, on later starts, I run this on each node:
java -jar -DzkHost='node0,node1,node2' start.jar

Because I ran that command on node0 first, that node became the leader
shard.

I send an update to the leader shard (in this case node0):
curl 'http://node0:8983/solr/update?commit=true' \
  -H 'Content-type: text/xml' \
  --data-binary '<add><doc><field name="id">asdf</field><field name="content">buggy</field></doc></add>'

When I query any shard I get the correct result:
I run curl 'http://node0:8983/solr/select?q=id:asdf'
or curl 'http://node1:8983/solr/select?q=id:asdf'
or curl 'http://node2:8983/solr/select?q=id:asdf'
(i.e. I send the query to each node), and then I get the expected response ...
<doc><str name="id">asdf</str><arr name="content"> <str>buggy</str> </arr>
... </doc>...

But when I send an update to a replica shard (node2), it is applied only in
the leader shard (node0) and in the other replica (node1), not in the shard
that received the update (node2).
I send the update to the replica node2:
curl 'http://node2:8983/solr/update?commit=true' \
  -H 'Content-type: text/xml' \
  --data-binary '<add><doc><field name="id">asdf</field><field name="content">big moth</field></doc></add>'

Then I query each node and I receive the updated results only from the
leader shard (node0) and the other replica shard (node1).

I run (leader, node0):
curl 'http://node0:8983/solr/select?q=id:asdf'
And I get:
... <doc><str name="id">asdf</str><arr name="content"> <str>big moth</str>
</arr> ... </doc> ...

I run (other replica, node1):
curl 'http://node1:8983/solr/select?q=id:asdf'
And I get:
... <doc><str name="id">asdf</str><arr name="content"> <str>big moth</str>
</arr> ... </doc> ...

I run (first replica, the one that received the update, node2):
curl 'http://node2:8983/solr/select?q=id:asdf'
And I get (old result):
... <doc><str name="id">asdf</str><arr name="content"> <str>buggy</str>
</arr> ... </doc> ...
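
The per-node check above can be scripted. The following is only a sketch: the node names and the query come from this thread, the function names build_select_url and compare_nodes are made up for illustration, and the distrib=false parameter is an addition that was not used in the commands above. It asks each node to answer from its own local index instead of distributing the query, which should make a stale replica easier to spot.

```shell
#!/bin/sh
# Node names as used in this thread; adjust for your own cluster.
NODES="node0 node1 node2"

# Hypothetical helper (not part of Solr): build the select URL for one node
# and one document id. distrib=false is an assumption added here so that
# the node answers from its local index only.
build_select_url() {
    # $1 = host name, $2 = document id
    printf 'http://%s:8983/solr/select?q=id:%s&distrib=false\n' "$1" "$2"
}

# Hypothetical helper: query every node for the same id and print each raw
# response under a header line, so the copies can be compared by eye.
compare_nodes() {
    for node in $NODES; do
        echo "== $node =="
        curl -s "$(build_select_url "$node" "$1")"
        echo
    done
}

# Example (needs a running cluster):
#   compare_nodes asdf
```

Nothing is queried until compare_nodes is called. Dropping &distrib=false from the URL reproduces the exact queries shown above.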

Thanks for your interest,

Sebastián Ramírez




Re: Replica shards not updating their index when update is sent to them

Posted by Yonik Seeley <yo...@lucidworks.com>.
On Mon, May 20, 2013 at 4:21 PM, Sebastián Ramírez
<se...@senseta.com> wrote:
> When I send an update to a non-leader (replica) shard (B), the updated
> results are reflected in the leader shard (A) and in the other replica
> shard (C), but not in the shard that received the update (B).

I've never seen that before.  The replica that received the update
isn't treated as special in any way by the code, so it's not clear how
this could happen.

What version of Solr is this (and does it happen with the latest
version)?  How easy is this to reproduce for you?

-Yonik
http://lucidworks.com