Posted to solr-user@lucene.apache.org by Timothy Potter <th...@gmail.com> on 2013/11/19 18:14:09 UTC

Option to enforce a majority quorum approach to accepting updates in SolrCloud?

I've been thinking about how SolrCloud deals with write-availability using
in-sync replica sets, in which writes will continue to be accepted so long
as there is at least one healthy node per shard.

For a little background (and to verify my understanding of the process is
correct), SolrCloud only considers active/healthy replicas when
acknowledging a write. Specifically, when a shard leader accepts an update
request, it forwards the request to all active/healthy replicas and only
considers the write successful if all active/healthy replicas ack the
write. Any down / gone replicas are not considered and will sync up with
the leader when they come back online using peer sync or snapshot
replication. For instance, if a shard has 3 nodes, A, B, C with A being the
current leader, then writes to the shard will continue to succeed even if B
& C are down.

The issue is that if a shard leader continues to accept updates even if it
loses all of its replicas, then we have acknowledged updates on only 1
node. If that node, call it A, then fails and one of the previous replicas,
call it B, comes back online before A does, then any writes that A accepted
while the other replicas were offline are at risk of being lost.

SolrCloud does provide a safe-guard mechanism for this problem with the
leaderVoteWait setting, which puts any replicas that come back online
before node A into a temporary wait state. If A comes back online within
the wait period, then all is well as it will become the leader again and no
writes will be lost. As a side note, sys admins definitely need to be made
more aware of this situation; when I first encountered it in my cluster,
I had no idea what it meant.

My question is whether we want to consider an approach where SolrCloud will
not accept writes unless there is a majority of replicas available to
accept the write? For my example, under this approach, we wouldn't accept
writes if both B & C failed, but would if only C did, leaving A & B online.
Admittedly, this lowers the write-availability of the system, so it may be
something that should be tunable. Just wanted to put this out there as
something I've been thinking about lately ...

Cheers,
Tim
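The majority check proposed above can be sketched as a small helper. This is only an illustration of the decision rule, not Solr code; the function and parameter names are hypothetical:

```python
# Sketch of the proposed majority-quorum gate for accepting an update.
# total_replicas counts the whole replica set for the shard (down + active);
# active_replicas counts only the replicas that are currently healthy.
# These names are illustrative and not part of any Solr API.

def accept_write(total_replicas: int, active_replicas: int) -> bool:
    majority = total_replicas // 2 + 1
    return active_replicas >= majority

# Tim's example: a 3-replica shard with leader A.
assert accept_write(3, 3) is True   # A, B, C all up
assert accept_write(3, 2) is True   # only C down, A & B online
assert accept_write(3, 1) is False  # B & C down: reject rather than risk loss
```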

Re: Option to enforce a majority quorum approach to accepting updates in SolrCloud?

Posted by Timothy Potter <th...@gmail.com>.
Hi Otis,

I think these are related problems but giving the ability to enforce a
majority quorum among the total replica set for a shard is not the
same as hinted handoff in the Cassandra sense. Cassandra's hinted handoff
allows you to say it's ok to send the write somewhere and somehow
it'll make its way to the correct node, eventually. But, from the docs
"A hinted write does not count towards ConsistencyLevel requirements
of ONE, QUORUM, or ALL." It's useful for environments that need
extreme write-availability.

Today, SolrCloud uses a consistency level of ALL, where ALL is based
on the number of *active* replicas for a shard, which could be as
small as one. What I'm proposing is to give a knob to allow a
SolrCloud user to enforce a consistency level of QUORUM, where QUORUM
is based on the entire replica set (down + active replicas) and not
just active replicas as it is today. However, we'll need a better
vocabulary for this because in my scenario, QUORUM is stronger than
ALL, which will confuse even the most seasoned of distributed systems
engineers ;-)
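The distinction can be made concrete with a toy model (a sketch, not Solr code): the ack count required by ALL-over-active shrinks as replicas fail, while QUORUM-over-total stays fixed.

```python
# Toy model contrasting today's ALL-over-active semantics with the
# proposed QUORUM-over-total semantics for a 3-replica shard.
# Nothing here is a Solr API; it only illustrates required ack counts.

def acks_required_all_over_active(active: int) -> int:
    # Today: every *active* replica must ack, however few remain.
    return active

def acks_required_quorum_over_total(total: int) -> int:
    # Proposed: a majority of the *entire* replica set must ack.
    return total // 2 + 1

total = 3
for active in (3, 2, 1):
    print(active,
          acks_required_all_over_active(active),
          acks_required_quorum_over_total(total))
# With B & C down (active == 1), ALL-over-active is satisfied by a single
# ack, while QUORUM-over-total still demands 2 acks -- hence "QUORUM is
# stronger than ALL" in this unusual sense.
```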

Cheers,
Tim


On Tue, Nov 19, 2013 at 9:25 PM, Otis Gospodnetic
<ot...@gmail.com> wrote:
> Btw. isn't the situation Timothy is describing what hinted handoff is all
> about?
>
> http://wiki.apache.org/cassandra/HintedHandoff
> http://www.datastax.com/dev/blog/modern-hinted-handoff
>
> Check this:
> http://www.jroller.com/otis/entry/common_distributed_computing_routines
>
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/

Re: Option to enforce a majority quorum approach to accepting updates in SolrCloud?

Posted by Otis Gospodnetic <ot...@gmail.com>.
Btw. isn't the situation Timothy is describing what hinted handoff is all
about?

http://wiki.apache.org/cassandra/HintedHandoff
http://www.datastax.com/dev/blog/modern-hinted-handoff

Check this:
http://www.jroller.com/otis/entry/common_distributed_computing_routines

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Tue, Nov 19, 2013 at 1:58 PM, Mark Miller <ma...@gmail.com> wrote:

> Mostly a lot of other systems already offer these types of things, so they
> were hard not to think about while building :) Just hard to get back to a
> lot of those things, even though a lot of them are fairly low hanging
> fruit. Hardening takes the priority :(
>
> - Mark

Re: Option to enforce a majority quorum approach to accepting updates in SolrCloud?

Posted by Timothy Potter <th...@gmail.com>.
I got to thinking about this particular question while watching this
presentation, which is well worth 45 minutes if you can spare it:

http://www.infoq.com/presentations/partitioning-comparison

I created SOLR-5468 for this.


On Tue, Nov 19, 2013 at 11:58 AM, Mark Miller <ma...@gmail.com> wrote:

> Mostly a lot of other systems already offer these types of things, so they
> were hard not to think about while building :) Just hard to get back to a
> lot of those things, even though a lot of them are fairly low hanging
> fruit. Hardening takes the priority :(
>
> - Mark

Re: Option to enforce a majority quorum approach to accepting updates in SolrCloud?

Posted by Mark Miller <ma...@gmail.com>.
Mostly a lot of other systems already offer these types of things, so they were hard not to think about while building :) Just hard to get back to a lot of those things, even though a lot of them are fairly low hanging fruit. Hardening takes the priority :(

- Mark

On Nov 19, 2013, at 12:42 PM, Timothy Potter <th...@gmail.com> wrote:

> > Your thinking is always one step ahead of me! I'll file the JIRA.
> 
> Thanks.
> Tim


Re: Option to enforce a majority quorum approach to accepting updates in SolrCloud?

Posted by Timothy Potter <th...@gmail.com>.
Your thinking is always one step ahead of me! I'll file the JIRA.

Thanks.
Tim


On Tue, Nov 19, 2013 at 10:38 AM, Mark Miller <ma...@gmail.com> wrote:

> Yeah, this is kind of like one of many little features that we have just
> not gotten to yet. I’ve always planned for a param that lets you say how
> many replicas an update must be verified on before responding success.
> Seems to make sense to fail that type of request early if you notice there
> are not enough replicas up to satisfy the param to begin with.
>
> I don’t think there is a JIRA issue yet, fire away if you want.
>
> - Mark

Re: Option to enforce a majority quorum approach to accepting updates in SolrCloud?

Posted by Mark Miller <ma...@gmail.com>.
Yeah, this is kind of like one of many little features that we have just not gotten to yet. I’ve always planned for a param that lets you say how many replicas an update must be verified on before responding success. Seems to make sense to fail that type of request early if you notice there are not enough replicas up to satisfy the param to begin with.

I don’t think there is a JIRA issue yet, fire away if you want.

- Mark
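The "fail early" idea generalizes the majority check to an arbitrary per-request count. A sketch, with a hypothetical min_replicas value standing in for whatever name the parameter ends up with:

```python
# Sketch of failing an update early when fewer replicas are up than the
# requested verification count. min_replicas is a hypothetical stand-in
# for the eventual request parameter, not an existing Solr option.

def check_before_forwarding(active_replicas: int, min_replicas: int) -> None:
    if active_replicas < min_replicas:
        # Reject up front rather than forwarding the update and only
        # discovering afterwards that too few replicas verified it.
        raise RuntimeError(
            f"only {active_replicas} replicas active, "
            f"{min_replicas} required")

check_before_forwarding(2, 2)      # enough replicas up: proceed
try:
    check_before_forwarding(1, 2)  # too few: fail the request early
except RuntimeError:
    pass
```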
