Posted to solr-user@lucene.apache.org by Shai Erera <se...@gmail.com> on 2015/03/24 12:59:48 UTC

How to verify a document is indexed by all replicas

Hi

Is there a recommended, preferably fast, way to check that a document is
indexed by all replicas? I currently do that by issuing a search request to
each replica, but was wondering if there's a faster way.

Even better, is there a way to verify all replicas of a shard are
"up-to-date", e.g. by comparing their version or something? By "up-to-date"
I mean that they've all processed the same update requests that came
through.

If there's a replica lagging behind, I'd like to wait for it to catch up,
something like a checkpoint(), before I continue sending more updates.

Shai

Re: How to verify a document is indexed by all replicas

Posted by Shai Erera <se...@gmail.com>.

>
> You can add a min_rf=true parameter to your indexing
>

Yeah, I read about it, but it doesn't help me in this case: I'm
implementing a monitoring component over a SolrCloud instance, so I have
no handle to the indexing client. I would like the monitor to check the
replicas and report whether all replicas are in sync, some are not in
sync, or e.g. replicas 2 and 3 are further ahead than replica 1.

> Also, checking the state of the
> replica is not enough, one should always check for the state=active and
> live-ness of the replica i.e. the node is marked live under /live_nodes in
> ZK.
>

Thanks, I've looked at code samples in tests and saw this is done, so I
copied the logic. E.g. an isReplicaAlive(Replica replica) method checks
both the replica's state and that the node it's on is in the cluster
state's live nodes.

Also, verifying that replicas are in sync via searching is not the best
solution at all. Apart from not being that fast, it also doesn't factor in
documents that are in the tlog, or in IW's RAM buffer, or even that a
document may have been updated. So I will change my test to ensure that all
replicas of a slice are in state active (and on a live node) and rely on
that being OK.
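
A minimal sketch of that check, written in Python so it runs without a
cluster. It assumes the cluster state is available as a dictionary shaped
like the "cluster" section of a Collections API CLUSTERSTATUS response
(collections, shards, replicas with "state" and "node_name", plus a
"live_nodes" list). The function names and the sample data are
illustrative only and are not taken from Solr or SolrJ.

```python
from typing import Dict, List, Set


def is_replica_alive(replica: dict, live_nodes: Set[str]) -> bool:
    """True only if the replica is state=active AND its node is live."""
    return (replica.get("state") == "active"
            and replica.get("node_name") in live_nodes)


def inactive_replicas(cluster: dict, collection: str) -> Dict[str, List[str]]:
    """Map each shard to the replicas that fail the check (empty if all pass)."""
    live_nodes = set(cluster.get("live_nodes", []))
    shards = cluster["collections"][collection]["shards"]
    report: Dict[str, List[str]] = {}
    for shard_name, shard in shards.items():
        bad = [name for name, replica in shard["replicas"].items()
               if not is_replica_alive(replica, live_nodes)]
        if bad:
            report[shard_name] = sorted(bad)
    return report


# Hand-written sample (assumed shape, not captured from a real cluster).
sample = {
    "live_nodes": ["host1:8983_solr", "host2:8983_solr"],
    "collections": {"demo": {"shards": {
        "shard1": {"replicas": {
            "core_node1": {"state": "active", "node_name": "host1:8983_solr"},
            "core_node2": {"state": "recovering", "node_name": "host2:8983_solr"},
            # Marked active, but its node is missing from live_nodes.
            "core_node3": {"state": "active", "node_name": "host3:8983_solr"},
        }},
        "shard2": {"replicas": {
            "core_node4": {"state": "active", "node_name": "host2:8983_solr"},
        }},
    }}},
}

if __name__ == "__main__":
    print(inactive_replicas(sample, "demo"))
```

Checking the state alone would pass core_node3 in the sample, which is why
the live-node test is part of the same predicate.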

Shai


Re: How to verify a document is indexed by all replicas

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.

Hi Shai,

To your original question on how to know if a document has been indexed at
all replicas -- You can add a min_rf=true parameter to your indexing
request and then Solr will add information to the response about how many
replicas sent an ack to the leader. So if the returned number is equal to
the number of replicas, you can be sure that the doc has been indexed
everywhere.
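
A rough sketch of the comparison described above, in Python. Several things
here are assumptions rather than facts from this thread: that the achieved
replication factor comes back as "rf" inside the JSON responseHeader, that
min_rf is passed as an integer (which is how I understand the parameter to
be documented, rather than min_rf=true), and whether the expected count
includes the leader. The helper names update_url and indexed_everywhere are
made up for illustration.

```python
from urllib.parse import urlencode


def update_url(base_url: str, collection: str, min_rf: int) -> str:
    """Build an update URL that asks Solr to report the achieved rf."""
    query = urlencode({"min_rf": min_rf, "wt": "json"})
    return f"{base_url}/{collection}/update?{query}"


def indexed_everywhere(response: dict, replica_count: int) -> bool:
    """Compare the reported rf (assumed location) with the expected count."""
    achieved = response.get("responseHeader", {}).get("rf")
    # A missing rf is treated as "unknown", so it does not count as success.
    return achieved is not None and int(achieved) >= replica_count


if __name__ == "__main__":
    print(update_url("http://localhost:8983/solr", "demo", 3))
    # Hand-written response for illustration, not real Solr output.
    print(indexed_everywhere({"responseHeader": {"status": 0, "rf": 2}}, 3))
```

This only helps a client that sends the update itself, since the number is
reported on the indexing response.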

More comments inline:

On Tue, Mar 24, 2015 at 8:18 AM, Shai Erera <se...@gmail.com> wrote:

> Thanks Erick,
>
> When a replica is down, no updates are sent to it. When it comes back up,
> it discovers that it needs to catch-up with the leader. If there are many
> events it falls back to index replication (slower). During this period of
> time, is the replica considered ACTIVE or RECOVERING?
>
>
It is marked as recovering.


> And, can I assume that at any given moment (aside from ZK connection
> timeouts etc.) when I check the replicas' state, all the ones that report
> ACTIVE are in sync with each other?
>
>
Yes, 'active' replicas should be in sync but autoCommits can cause
inconsistency between replicas as to what is visible to searchers (even if
all replicas have indexed the same data). Also, checking the state of the
replica is not enough, one should always check for the state=active and
live-ness of the replica i.e. the node is marked live under /live_nodes in
ZK.





-- 
Regards,
Shalin Shekhar Mangar.

Re: How to verify a document is indexed by all replicas

Posted by Shai Erera <se...@gmail.com>.

Thanks Erick,

When a replica is down, no updates are sent to it. When it comes back up,
it discovers that it needs to catch up with the leader. If there are many
events, it falls back to index replication (slower). During this period of
time, is the replica considered ACTIVE or RECOVERING?

And, can I assume that at any given moment (aside from ZK connection
timeouts etc.) when I check the replicas' state, all the ones that report
ACTIVE are in sync with each other?

Shai


Re: How to verify a document is indexed by all replicas

Posted by Erick Erickson <er...@gmail.com>.

You can always issue a *:* query to each replica, but only once at least
your autoSoftCommit interval has passed since the last update, since the
soft commit triggers fire at slightly different wall clock times on each
replica.

But I don't think it should be necessary to wait. Since the
indexing request doesn't succeed until the docs have been written to
the tlogs, and since the tlogs will be replayed in the event of a
problem, your data should be fine. Of course, if you're indexing at a
very fast rate and your tlog is huge, the replay will take a while....
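
The *:* check mentioned above could look roughly like this in Python. It
builds a non-distributed query (distrib=false) against one core, reads
numFound from the JSON response, and flags replicas whose count is below
the highest one seen. The core URLs, the helper names and the idea of
comparing against the maximum are assumptions for the sake of the sketch.
Counts are only comparable after the soft commit has run on every replica,
and they say nothing about documents that are still only in the tlog.

```python
import json
from typing import Dict, List
from urllib.parse import urlencode
from urllib.request import urlopen


def count_url(core_url: str) -> str:
    """URL for a match-all count that is answered by this core only."""
    params = urlencode({"q": "*:*", "rows": 0, "distrib": "false", "wt": "json"})
    return f"{core_url}/select?{params}"


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """Fetch and decode a JSON response (needs a reachable Solr core)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def num_found(response: dict) -> int:
    """Extract the hit count from a select response."""
    return int(response["response"]["numFound"])


def lagging_replicas(counts: Dict[str, int]) -> List[str]:
    """Names of replicas whose count is below the highest count seen."""
    if not counts:
        return []
    top = max(counts.values())
    return sorted(name for name, n in counts.items() if n < top)


if __name__ == "__main__":
    # Hypothetical counts; in practice use num_found(fetch_json(count_url(u))).
    print(lagging_replicas({"core_node1": 100, "core_node2": 100, "core_node3": 97}))
```

Equal counts do not prove the replicas hold the same documents, so this is
at best a coarse signal.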

FWIW,
Erick
