Posted to dev@ignite.apache.org by David Harvey <sy...@gmail.com> on 2018/08/23 18:58:02 UTC

affinityBackupFilter for AWS Availability Zones

I need an affinityBackupFilter that will prevent a partition's backups from
being placed in the same AWS availability zone as its other copies.  (A
single availability zone has the characteristic that some or all of the EC2
instances in that zone can fail together due to a single fault.  You have no
control over the hosts on which the EC2 instance VMs run in AWS, except by
choosing the availability zone.)

I could write a few lines of custom code, but then I would have to get it
deployed on all nodes in the cluster, and peer class loading will not work
for this.  So I could not use an off-the-shelf Docker image, for example.
That code should therefore just be part of Ignite.

I was thinking of adding a new class along these lines, where the apply
function returns true only if none of the node's attributes match those
of any of the nodes in the list.  This would become part of the code base,
but would only be used if configured as the affinityBackupFilter:

ClusterNodeNoAttributesMatchBiPredicate
    implements IgniteBiPredicate<ClusterNode, List<ClusterNode>> {

    ClusterNodeNoAttributesMatchBiPredicate(String[] attributeNames)
    {....}
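
To make the intended semantics concrete, here is a minimal sketch of such a
predicate.  It is not the Ignite implementation: so that it runs with only
the JDK, java.util.function.BiPredicate stands in for IgniteBiPredicate and a
plain attribute map stands in for ClusterNode.  The class name and the
treatment of a missing attribute (it never counts as a match) are assumptions
made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

/**
 * Sketch only. A node is modelled as its attribute map (stand-in for
 * ClusterNode), and BiPredicate stands in for IgniteBiPredicate.
 */
final class NoAttributesMatchSketch
    implements BiPredicate<Map<String, String>, List<Map<String, String>>> {

    private final String[] attributeNames;

    NoAttributesMatchSketch(String... attributeNames) {
        this.attributeNames = attributeNames.clone();
    }

    /**
     * Returns true only if, for every configured attribute, the candidate's
     * value differs from that of every node already holding a copy.
     */
    @Override
    public boolean test(Map<String, String> candidate,
                        List<Map<String, String>> assigned) {
        for (String name : attributeNames) {
            String value = candidate.get(name);

            // Assumption: a node without the attribute never matches.
            if (value == null)
                continue;

            for (Map<String, String> other : assigned) {
                if (value.equals(other.get(name)))
                    return false;
            }
        }
        return true;
    }
}
```

With a single attribute holding the zone name (for example a hypothetical
"AVAILABILITY_ZONE" attribute set on each node), a candidate in the same zone
as an already assigned node is rejected and a candidate in any other zone is
accepted.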

For AvailabilityZones, there would be only one attribute examined, but we
have some potential use cases for distributing backups across two
sub-groups of an AZ.

Alternatively, we could enhance the RendezvousAffinityFunction to allow one
or more arbitrary attributes to be compared to determine neighbors, rather
than only org.apache.ignite.macs, and add a setting that controls whether
backups may be placed on neighbors if they can't be placed anywhere else.
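
As a rough illustration of that alternative, the sketch below picks copies
from an already ordered candidate list in two passes: the first pass skips
neighbors, and the second pass fills remaining slots with neighbors only if
the setting allows it.  This is not RendezvousAffinityFunction code; the
class name, method names and the allowNeighborsAsLastResort flag are
hypothetical, and nodes are again modelled as plain attribute maps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch only: neighbor-aware selection with an optional fallback pass. */
final class NeighborAwareAssignmentSketch {

    /** Nodes are neighbors if they share a non-null value for any attribute. */
    static boolean neighbors(Map<String, String> a, Map<String, String> b,
                             List<String> attrNames) {
        for (String name : attrNames) {
            String v = a.get(name);
            if (v != null && v.equals(b.get(name)))
                return true;
        }
        return false;
    }

    /**
     * Picks up to 'copies' nodes from 'ordered' (primary first).
     * Pass 1 skips neighbors of nodes already chosen. Pass 2 runs only when
     * allowNeighborsAsLastResort is set and fills the remaining slots.
     */
    static List<Map<String, String>> assign(List<Map<String, String>> ordered,
                                            int copies,
                                            List<String> attrNames,
                                            boolean allowNeighborsAsLastResort) {
        List<Map<String, String>> chosen = new ArrayList<>();
        // Track picks by position, since two nodes may have equal attributes.
        boolean[] picked = new boolean[ordered.size()];

        for (int i = 0; i < ordered.size() && chosen.size() < copies; i++) {
            Map<String, String> cand = ordered.get(i);
            boolean clash = false;
            for (Map<String, String> c : chosen) {
                if (neighbors(cand, c, attrNames)) {
                    clash = true;
                    break;
                }
            }
            if (!clash) {
                chosen.add(cand);
                picked[i] = true;
            }
        }

        if (allowNeighborsAsLastResort) {
            for (int i = 0; i < ordered.size() && chosen.size() < copies; i++) {
                if (!picked[i]) {
                    chosen.add(ordered.get(i));
                    picked[i] = true;
                }
            }
        }
        return chosen;
    }
}
```

With the flag off, a cache asking for three copies in a cluster that spans
only two zones ends up with two copies; with the flag on, the third copy
lands on a neighbor.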

If I have 2 backups and three availability zones (AZs), I want one copy of
the data in each AZ.  If all nodes in one AZ fail, I want to be able to
decide whether to try to get back to three copies anyway, which increases
the per-node footprint by 50% (three copies spread over two AZs instead of
three), or to run with only one backup.  This would also be a convoluted way
to change the number of backups of a cache dynamically: start the cache with
a large number of backups, but don't initially provide a location where the
extra backups would be allowed to run.
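
The 50% figure can be checked with a small calculation.  The sketch below
assumes equally sized zones and evenly spread data; the class name and the
node count per zone are made up for illustration.

```java
/** Sketch only: per-node footprint before and after losing one of three AZs. */
final class FootprintSketch {

    /**
     * Data held per node, as a fraction of the data set size, assuming the
     * copies are spread evenly over all live nodes.
     */
    static double perNode(int copies, int liveZones, int nodesPerZone) {
        return (double) copies / (liveZones * nodesPerZone);
    }

    public static void main(String[] args) {
        int nodesPerZone = 4; // hypothetical cluster size

        double healthy = perNode(3, 3, nodesPerZone);  // 3 copies, 3 zones
        double rebuilt = perNode(3, 2, nodesPerZone);  // 3 copies, 2 zones
        double degraded = perNode(2, 2, nodesPerZone); // 2 copies, 2 zones

        System.out.printf("rebuild to 3 copies: %+.0f%%%n",
            (rebuilt / healthy - 1) * 100);
        System.out.printf("stay at 2 copies:    %+.0f%%%n",
            (degraded / healthy - 1) * 100);
    }
}
```

Rebuilding the third copy on the two surviving zones raises the per-node
footprint by half, while staying at two copies leaves it unchanged.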

Re: affinityBackupFilter for AWS Availability Zones

Posted by David Harvey <sy...@gmail.com>.
Yes, thanks Val!

On Mon, Sep 24, 2018 at 11:35 AM Dmitriy Pavlov <dp...@gmail.com>
wrote:

> Hi Val, many thanks for the review.

Re: affinityBackupFilter for AWS Availability Zones

Posted by Dmitriy Pavlov <dp...@gmail.com>.
Hi Val, many thanks for the review.

On Wed, Sep 12, 2018 at 20:35, Valentin Kulichenko <
valentin.kulichenko@gmail.com> wrote:

> Yes, will try to review this week.
>
> -Val

Re: affinityBackupFilter for AWS Availability Zones

Posted by Valentin Kulichenko <va...@gmail.com>.
Yes, will try to review this week.

-Val

On Wed, Sep 12, 2018 at 10:24 AM Dmitriy Pavlov <dp...@gmail.com>
wrote:

> Hi Val,
>
> I'm not an expert in AWS, so could you please pick up the review?
>
> Thank you in advance!
>
> Sincerely,
> Dmitriy Pavlov

Re: affinityBackupFilter for AWS Availability Zones

Posted by Dmitriy Pavlov <dp...@gmail.com>.
Hi Val,

I'm not an expert in AWS, so could you please pick up the review?

Thank you in advance!

Sincerely,
Dmitriy Pavlov

On Tue, Sep 11, 2018 at 1:28, Dave Harvey <dh...@jobcase.com> wrote:

> Submitted a patch for this
> https://issues.apache.org/jira/browse/IGNITE-9365
>
>
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
>

Re: affinityBackupFilter for AWS Availability Zones

Posted by Dave Harvey <dh...@jobcase.com>.
Submitted a patch for this https://issues.apache.org/jira/browse/IGNITE-9365



--
Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/

Re: affinityBackupFilter for AWS Availability Zones

Posted by David Harvey <sy...@gmail.com>.
Added IGNITE-9365

On Thu, Aug 23, 2018 at 3:56 PM Valentin Kulichenko <
valentin.kulichenko@gmail.com> wrote:

> Hi David,
>
> With the Docker image you can actually use additional libraries by
> providing URLs to JARs via the EXTERNAL_LIBS property. Please refer to this
> page: https://apacheignite.readme.io/docs/docker-deployment
>
> But anyway, I believe that such a contribution might be very valuable for
> Ignite. Feel free to create a ticket.
>
> -Val

Re: affinityBackupFilter for AWS Availability Zones

Posted by Valentin Kulichenko <va...@gmail.com>.
Hi David,

With the Docker image you can actually use additional libraries by
providing URLs to JARs via the EXTERNAL_LIBS property. Please refer to this
page: https://apacheignite.readme.io/docs/docker-deployment

But anyway, I believe that such a contribution might be very valuable for
Ignite. Feel free to create a ticket.

-Val
