Posted to dev@ignite.apache.org by Alexey Goncharuk <al...@gmail.com> on 2020/11/19 09:46:22 UTC

IEP-61 Technical discussion

Following up on the Ignite 3.0 scope/development approach threads, this is a
separate thread to discuss the technical aspects of the IEP.

Let's revisit the questions raised by Ivan and also see if there are any
other thoughts on the IEP:

   - *Whether to deploy metastorage on a separate subset of the nodes or
   allow Ignite to choose these nodes automatically.* I think it is
   feasible to maintain both modes: by default, Ignite will choose the
   metastorage nodes automatically, which will provide essentially the same
   seamless user experience as the TCP discovery SPI: no separate roles and
   a simple deployment. For deployments where people want more fine-grained
   control over node assignments, we will provide a runtime configuration
   that allows pinning the metastorage group to certain nodes, thus
   eliminating the latency concerns.
   - *Whether there are any TLA+ specs for the PacificA protocol.* Not to
   my knowledge, but it is known to be used in production by Microsoft and
   other projects, e.g. [1].
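
As an illustration of the two placement modes in the first item above, here
is a minimal Java sketch. All names (MetastoragePlacement, choose, the
pinnedNodes parameter) are hypothetical and are not taken from IEP-61 or the
Ignite code base. The automatic mode simply takes the first groupSize node
names in sorted order, which is an assumption made for the example and not a
proposed selection policy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: names are illustrative, not part of IEP-61 or Ignite. */
final class MetastoragePlacement {
    /**
     * Returns the node names that should host the metastorage group.
     *
     * @param clusterNodes all known node names
     * @param pinnedNodes  explicitly configured nodes; empty means "choose automatically"
     * @param groupSize    desired group size in automatic mode
     */
    static List<String> choose(List<String> clusterNodes, List<String> pinnedNodes, int groupSize) {
        if (!pinnedNodes.isEmpty()) {
            // Pinned mode: the administrator controls placement; reject unknown nodes.
            for (String node : pinnedNodes) {
                if (!clusterNodes.contains(node)) {
                    throw new IllegalArgumentException("Unknown node: " + node);
                }
            }
            return List.copyOf(pinnedNodes);
        }
        // Automatic mode (assumption for this sketch): a deterministic pick by sorted name.
        List<String> sorted = new ArrayList<>(clusterNodes);
        Collections.sort(sorted);
        return List.copyOf(sorted.subList(0, Math.min(groupSize, sorted.size())));
    }
}
```

With an empty pinnedNodes list the caller gets the automatic behaviour;
passing a non-empty list switches to the pinned mode without any other
change on the caller's side.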

I would like to collect general feedback on the IEP, as well as feedback on
specific parts of it, such as:

   - Metastorage API
   - Is there any existing library that we can use to avoid re-implementing
   the protocol ourselves? Perhaps we could port an existing implementation
   to Java, the way TiKV did with etcd-raft [2] [3]. In my opinion this is a
   very neat way, because I like the finite-automaton-like approach of the
   replication module and, additionally, we could sync bug fixes and
   improvements from the upstream project.
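
To make the "finite automata-like" shape concrete, here is a minimal Java
sketch of a replication core that performs no I/O and owns no threads: the
caller feeds it ticks and incoming messages, then drains the messages to send
and the entries to persist. It is deliberately not a Raft implementation, and
the names (ReplicaCore, Ready, step, tick, takeReady) are hypothetical rather
than the etcd-raft or raft-rs API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the automaton shape only; this is not Raft. */
final class ReplicaCore {
    enum Role { FOLLOWER, CANDIDATE }

    /** Output of the automaton: the caller sends the messages and persists the entries. */
    static final class Ready {
        final List<String> messages = new ArrayList<>();
        final List<String> entriesToPersist = new ArrayList<>();
    }

    private final List<String> peers;
    private final int electionTimeoutTicks;
    private int elapsedTicks;
    private long term;
    private Role role = Role.FOLLOWER;
    private Ready pending = new Ready();

    ReplicaCore(List<String> peers, int electionTimeoutTicks) {
        this.peers = List.copyOf(peers);
        this.electionTimeoutTicks = electionTimeoutTicks;
    }

    /** Input 1: a logical clock tick driven by the caller; there are no timers inside. */
    void tick() {
        if (++elapsedTicks < electionTimeoutTicks) {
            return;
        }
        // Timeout reached: start a new term and queue vote requests for the caller to send.
        elapsedTicks = 0;
        term++;
        role = Role.CANDIDATE;
        for (String peer : peers) {
            pending.messages.add("RequestVote(term=" + term + ") -> " + peer);
        }
    }

    /** Input 2: a message received from a peer; entry may be null for a heartbeat. */
    void step(String from, long msgTerm, String entry) {
        if (msgTerm < term) {
            return; // Stale sender: ignore.
        }
        term = msgTerm;
        role = Role.FOLLOWER;
        elapsedTicks = 0;
        if (entry != null) {
            pending.entriesToPersist.add(entry);
        }
        pending.messages.add("Ack(term=" + term + ") -> " + from);
    }

    /** Hands the accumulated output to the caller and starts a fresh batch. */
    Ready takeReady() {
        Ready ready = pending;
        pending = new Ready();
        return ready;
    }

    Role role() { return role; }

    long term() { return term; }
}
```

Because every input and output is explicit, such a core can be driven from a
single thread and tested deterministically, and the transport, storage and
threading model stay entirely on the caller's side.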


Thanks,
--AG

[1] https://cwiki.apache.org/confluence/display/INCUBATOR/PegasusProposal
[2] https://github.com/etcd-io/etcd/tree/master/raft
[3] https://github.com/tikv/raft-rs

Re: IEP-61 Technical discussion

Posted by Alexei Scherbakov <al...@gmail.com>.

Hi.

We have made some progress on the topic.

The JRaft fork has been merged into the Ignite 3 master branch and is now
integrated with the other completed components.

The first iteration of the transactional protocol design is published in
the master branch [1].

[1] https://github.com/apache/ignite-3/tree/main/modules/transactions


Sat, 20 Mar 2021 at 21:00, Alexei Scherbakov <alexey.scherbakoff@gmail.com>:

> Folks,
>
> I want to share some information about the progress in implementing the
> Raft protocol in Ignite 3, which is a prerequisite for the metastorage.
>
> The implementation will consist of client and server modules. The client
> is responsible for interoperability between a Raft server node and any
> other remote or local Java process.
>
> I have recently finished the Raft client API. The public API part is
> available here [1] for review. The entry point is the RaftGroupService
> interface. The service implementation has not been finished yet and can be
> skipped for now.
>
> As for the server part, we are currently investigating two options. The
> first is the etcd [2] implementation ported to Java. The drawback here is
> the amount of work required to make it work. The second option is to adopt
> the jraft [3] implementation. It is a full-featured implementation already
> written in Java, but the code is not quite clean in my opinion and will
> require some refactoring.
>
> The next step is to make the Raft client work with the server
> implementations. At least one is required for the next alpha. The plan is
> to have the same client for both server implementations. As soon as both
> are ready, we will compare them by running consistency tests and
> benchmarks and drop the worse one. I will give the next update when we
> have a working client and at least one server implementation ready.
>
> [1] https://github.com/apache/ignite-3/pull/59/files
> [2] https://github.com/etcd-io/etcd/tree/master/raft
> [3] https://github.com/sofastack/sofa-jraft
>
>
> --
>
> Best regards,
> Alexei Scherbakov
>


-- 

Best regards,
Alexei Scherbakov

Re: IEP-61 Technical discussion

Posted by Alexei Scherbakov <al...@gmail.com>.

Folks,

I want to share some information about the progress in implementing the Raft
protocol in Ignite 3, which is a prerequisite for the metastorage.

The implementation will consist of client and server modules. The client is
responsible for interoperability between a Raft server node and any other
remote or local Java process.

I have recently finished the Raft client API. The public API part is
available here [1] for review. The entry point is the RaftGroupService
interface. The service implementation has not been finished yet and can be
skipped for now.

As for the server part, we are currently investigating two options. The first
is the etcd [2] implementation ported to Java. The drawback here is the
amount of work required to make it work. The second option is to adopt the
jraft [3] implementation. It is a full-featured implementation already
written in Java, but the code is not quite clean in my opinion and will
require some refactoring.

The next step is to make the Raft client work with the server
implementations. At least one is required for the next alpha. The plan is to
have the same client for both server implementations. As soon as both are
ready, we will compare them by running consistency tests and benchmarks and
drop the worse one. I will give the next update when we have a working
client and at least one server implementation ready.

[1] https://github.com/apache/ignite-3/pull/59/files
[2] https://github.com/etcd-io/etcd/tree/master/raft
[3] https://github.com/sofastack/sofa-jraft

Fri, 27 Nov 2020 at 20:26, Alexey Goncharuk <al...@gmail.com>:

> Folks, thanks to everyone who joined the call. Summary:
>
>    - We agree that it may be beneficial to separate the metastorage and
>    group membership services; however, the abstractions should be clean
>    enough that we could implement group membership via metastorage
>    - Production cluster setup will involve an administrator 'init' command
>    that will initialize the metastorage raft group. Once the metastorage is
>    initialized, all nodes may be restarted arbitrarily
>    - An HA cluster must contain at least 3 nodes. A 2-node cluster will stop
>    making progress when one of the nodes fails (due to metastorage
>    requirements)
>    - We will provide a 'developer' cluster mode which will allow a 1-node
>    setup and auto-initialization without the 'init' command
>    - We are targeting centralized affinity calculation, with the result
>    stored in the metastorage. Metastorage downtime does not necessarily mean
>    cluster unavailability (subject to the partition replication protocol
>    choice). It would be good to hide the partition object as much as
>    possible so that we could support range partitioning in the future
>
> To discuss at the next meeting (do not hesitate to send questions here
> before the meeting):
>
>    - Raft implementation details (API model, porting, etc)
>    - Transactions interaction with replication protocol
>    - Weaker consistency options
>
> Please add more if I forgot something and let's choose a time for the next
> meeting.
>
> --AG
>
> Thu, 26 Nov 2020 at 16:12, Kseniya Romanova <romanova.ks.spb@gmail.com>:
>
> > Done
> >
> > Thu, 26 Nov 2020 at 13:18, Ivan Daschinsky <iv...@gmail.com>:
> >
> > > Alexey, is it possible to have the call at 16:00 MSK?
> > >
> > > Thu, 26 Nov 2020 at 12:30, Alexey Goncharuk <alexey.goncharuk@gmail.com>:
> > >
> > > > Hi Ivan,
> > > >
> > > > Unfortunately, the earliest window available for us is 12:00 MSK
> > > > (1 hour slot), or after 14:30 MSK. Let me know what time works best
> > > > for you.
> > > >
> > > > Wed, 25 Nov 2020 at 21:38, Ivan Daschinsky <iv...@gmail.com>:
> > > >
> > > > > Alexey, could you please move the meeting a little earlier?
> > > > > Ideally, to the morning.
> > > > >
> > > > > Wed, 25 Nov 2020 at 20:10, Alexey Goncharuk <alexey.goncharuk@gmail.com>:
> > > > >
> > > > > > Folks, let's have the call on Friday, Nov 27th at 18:00 MSK? We can
> > > > > > use the following waiting room link:
> > > > > > https://zoom.us/j/99450012496?pwd=RWZmOGhCNWlRK0ZpamdOOTZsYTJ0dz09
> > > > > >
> > > > > > Let me know if this time works for everybody.
> > > > > >
> > > > > > Wed, 25 Nov 2020 at 16:42, Alexey Goncharuk <alexey.goncharuk@gmail.com>:
> > > > > >
> > > > > > > Folks,
> > > > > > >
> > > > > > > I've made some edits in IEP-61 [1] regarding the group membership
> > > > > > > service and the transaction protocol's interaction with the
> > > > > > > replication infrastructure; please take a look before our Friday
> > > > > > > call.
> > > > > > >
> > > > > > > [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-61%3A+Common+Replication+Infrastructure
> > > > > > >
> > > > > > > Mon, 23 Nov 2020 at 13:28, Alexey Goncharuk <alexey.goncharuk@gmail.com>:
> > > > > > >
> > > > > > > >> Thanks, Ivan,
> > > > > > > >>
> > > > > > > >> Another protocol for group membership worth checking out is RAPID
> > > > > > > >> [1] (a recent one). Not sure, though, if there are any
> > > > > > > >> implementations available for it yet.
> > > > > > > >>
> > > > > > > >> [1] https://www.usenix.org/system/files/conference/atc18/atc18-suresh.pdf
> > > > > > >>
> > > > > > > >> Mon, 23 Nov 2020 at 10:46, Ivan Daschinsky <ivandasch@gmail.com>:
> > > > > > > >>
> > > > > > > >>> Also, here is some interesting reading about gossip, SWIM, etc.
> > > > > > > >>>
> > > > > > > >>> 1 -- http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf
> > > > > > > >>> 2 -- http://www.antonkharenko.com/2015/09/swim-distributed-group-membership.html
> > > > > > > >>> 3 -- https://github.com/hashicorp/memberlist (foundation library
> > > > > > > >>> of HashiCorp Serf)
> > > > > > > >>> 4 -- https://github.com/scalecube/scalecube-cluster (Java
> > > > > > > >>> implementation of SWIM)
> > > > > > >>>
> > > > > > > >>> Thu, 19 Nov 2020 at 16:35, Ivan Daschinsky <ivandasch@gmail.com>:
> > > > > > > >>>
> > > > > > > >>> > >> Friday, Nov 27th work for you? If ok, let's have an open
> > > > > > > >>> > >> call then.
> > > > > > > >>> > Yes, great
> > > > > > > >>> > >> As for the protocol port - we will not be dealing with the
> > > > > > > >>> > >> concurrency...
> > > > > > > >>> > >> Judging by the Rust port, it seems fairly straightforward.
> > > > > > > >>> > Yes, they chose to split transport and logic. But the original
> > > > > > > >>> > Go package from etcd (see raft/node.go) contains some heartbeat
> > > > > > > >>> > mechanism, etc.
> > > > > > > >>> > I agree with you, this does not seem to be a huge deal to port.
> > > > > > >>> >
> > > > > > > >>> > Thu, 19 Nov 2020 at 16:13, Alexey Goncharuk <alexey.goncharuk@gmail.com>:
> > > > > > > >>> >
> > > > > > > >>> >> Ivan,
> > > > > > > >>> >>
> > > > > > > >>> >> Agree, let's have a call to discuss the IEP. I have some more
> > > > > > > >>> >> thoughts regarding how the replication infrastructure works
> > > > > > > >>> >> with atomic/transactional caches, and will add this info to
> > > > > > > >>> >> the IEP. Does next Friday, Nov 27th, work for you? If so,
> > > > > > > >>> >> let's have an open call then.
> > > > > > > >>> >>
> > > > > > > >>> >> As for the protocol port: we will not be dealing with the
> > > > > > > >>> >> concurrency model if we choose this way, which is what I like
> > > > > > > >>> >> about their code structure. Essentially, the raft module is a
> > > > > > > >>> >> single-threaded automaton that has callbacks to process a
> > > > > > > >>> >> message and to process a tick (timeout), and that produces
> > > > > > > >>> >> messages that should be sent and log entries that should be
> > > > > > > >>> >> persisted. Judging by the Rust port, it seems fairly
> > > > > > > >>> >> straightforward. I will be happy to discuss this and other
> > > > > > > >>> >> alternatives on the call as well.
> > > > > > >>> >>
> > > > > > > >>> >> Thu, 19 Nov 2020 at 14:41, Ivan Daschinsky <ivandasch@gmail.com>:
> > > > > > > >>> >>
> > > > > > > >>> >> > > Any existing library that can be used to avoid
> > > > > > > >>> >> > > re-implementing the protocol ourselves? Perhaps, porting
> > > > > > > >>> >> > > the existing implementation to Java
> > > > > > > >>> >> > Personally, I like this idea. Go libraries (either the raft
> > > > > > > >>> >> > module of etcd or serf by HashiCorp) are known for clean
> > > > > > > >>> >> > code, good design, stability, and modest size.
> > > > > > > >>> >> > But, on the other hand, Go has a different concurrency
> > > > > > > >>> >> > model, and porting will probably not be so straightforward.
> > > > > > >>> >> >
> > > > > > >>> >> >
> > > > > > >>> >> >
> > > > > > > >>> >> > Thu, 19 Nov 2020 at 13:48, Ivan Daschinsky <ivandasch@gmail.com>:
> > > > > > > >>> >> >
> > > > > > > >>> >> > > I'd suggest discussing this IEP and the technical details
> > > > > > > >>> >> > > in an open Zoom meeting.
> > > > > > >>> >> > >
> > > > > > > >>> >> > > Thu, 19 Nov 2020 at 13:47, Ivan Daschinsky <ivandasch@gmail.com>:
> > > > > > >>> >> > >
> > > > > > >>> >> > >>
> > > > > > >>> >> > >>
> > > > > > >>> >> > >> ---------- Forwarded message ---------
> > > > > > > >>> >> > >> From: Ivan Daschinsky <iv...@gmail.com>
> > > > > > > >>> >> > >> Date: Thu, 19 Nov 2020 at 13:02
> > > > > > >>> >> > >> Subject: Re: IEP-61 Technical discussion
> > > > > > >>> >> > >> To: Alexey Goncharuk <al...@gmail.com>
> > > > > > >>> >> > >>
> > > > > > >>> >> > >>
> > > > > > > >>> >> > >> Alexey, let's raise another question: specifically, how
> > > > > > > >>> >> > >> nodes initially find each other (discovery) and how they
> > > > > > > >>> >> > >> detect failures.
> > > > > > > >>> >> > >>
> > > > > > > >>> >> > >> I suppose that a gossip protocol is an ideal candidate.
> > > > > > > >>> >> > >> For example, Consul [1] uses this approach, using the
> > > > > > > >>> >> > >> Serf [2] library to discover members of the cluster.
> > > > > > > >>> >> > >> Consul then forms a raft ensemble (server nodes), and
> > > > > > > >>> >> > >> clients use the raft ensemble only as a lock service.
> > > > > > > >>> >> > >>
> > > > > > > >>> >> > >> PacificA suggests an internal heartbeat mechanism for
> > > > > > > >>> >> > >> failure detection within a replicated group, but it says
> > > > > > > >>> >> > >> nothing about the initial discovery of nodes.
> > > > > > > >>> >> > >>
> > > > > > > >>> >> > >> WDYT?
> > > > > > > >>> >> > >>
> > > > > > > >>> >> > >> [1] -- https://www.consul.io/docs/architecture/gossip
> > > > > > > >>> >> > >> [2] -- https://www.serf.io/
> > > > > > >>> >> > >>
> > > > > > > >>> >> > >> Thu, 19 Nov 2020 at 12:46, Alexey Goncharuk <alexey.goncharuk@gmail.com>:
> > > > > > >>> >> > >>
> > > > > > > >>> >> > >>> Following up the Ignite 3.0 scope/development approach threads, this is
> > > > > > > >>> >> > >>> a separate thread to discuss technical aspects of the IEP.
> > > > > > > >>> >> > >>>
> > > > > > > >>> >> > >>> Let's reiterate one more time on the questions raised by Ivan and also
> > > > > > > >>> >> > >>> see if there are any other thoughts on the IEP:
> > > > > > > >>> >> > >>>
> > > > > > > >>> >> > >>>    - *Whether to deploy metastorage on a separate subset of the nodes
> > > > > > > >>> >> > >>>    or allow Ignite to choose these nodes automatically.* I think it is
> > > > > > > >>> >> > >>>    feasible to maintain both modes: by default, Ignite will choose
> > > > > > > >>> >> > >>>    metastorage nodes automatically which essentially will provide the same
> > > > > > > >>> >> > >>>    seamless user experience as TCP discovery SPI - no separate roles,
> > > > > > > >>> >> > >>>    simplistic deployment. For deployments where people want to have more
> > > > > > > >>> >> > >>>    fine-grained control over the nodes' assignments, we will provide a runtime
> > > > > > > >>> >> > >>>    configuration which will allow pinning metastorage group to certain nodes,
> > > > > > > >>> >> > >>>    thus eliminating the latency concerns.
> > > > > > > >>> >> > >>>    - *Whether there are any TLA+ specs for the PacificA protocol.* Not
> > > > > > > >>> >> > >>>    to my knowledge, but it is known to be used in production by Microsoft and
> > > > > > > >>> >> > >>>    other projects, e.g. [1]
> > > > > > > >>> >> > >>>
> > > > > > > >>> >> > >>> I would like to collect general feedback on the IEP, as well as feedback
> > > > > > > >>> >> > >>> on specific parts of it, such as:
> > > > > > > >>> >> > >>>
> > > > > > > >>> >> > >>>    - Metastorage API
> > > > > > > >>> >> > >>>    - Any existing library that can be used to avoid re-implementing the
> > > > > > > >>> >> > >>>    protocol ourselves? Perhaps, porting the existing implementation to Java
> > > > > > > >>> >> > >>>    (the way TiKV did with etcd-raft [2] [3]? This is a very neat way btw in my
> > > > > > > >>> >> > >>>    opinion because I like the finite automata-like approach of the replication
> > > > > > > >>> >> > >>>    module, and, additionally, we could sync bug fixes and improvements from
> > > > > > > >>> >> > >>>    the upstream project)
> > > > > > > >>> >> > >>>
> > > > > > > >>> >> > >>>
> > > > > > > >>> >> > >>> Thanks,
> > > > > > > >>> >> > >>> --AG
> > > > > > >>> >> > >>>
> > > > > > >>> >> > >>> [1]
> > > > > > >>> >> > >>>
> > > > > > >>> >>
> > > > > >
> > > https://cwiki.apache.org/confluence/display/INCUBATOR/PegasusProposal
> > > > > > >>> >> > >>> [2]
> https://github.com/etcd-io/etcd/tree/master/raft
> > > > > > >>> >> > >>> [3] https://github.com/tikv/raft-rs
> > > > > > >>> >> > >>>
> > > > > > >>> >> > >>
> > > > > > >>> >> > >>
> > > > > > >>> >> > >> --
> > > > > > >>> >> > >> Sincerely yours, Ivan Daschinskiy
> > > > > > >>> >> > >>
> > > > > > >>> >> > >>
> > > > > > >>> >> > >> --
> > > > > > >>> >> > >> Sincerely yours, Ivan Daschinskiy
> > > > > > >>> >> > >>
> > > > > > >>> >> > >
> > > > > > >>> >> > >
> > > > > > >>> >> > > --
> > > > > > >>> >> > > Sincerely yours, Ivan Daschinskiy
> > > > > > >>> >> > >
> > > > > > >>> >> >
> > > > > > >>> >> >
> > > > > > >>> >> > --
> > > > > > >>> >> > Sincerely yours, Ivan Daschinskiy
> > > > > > >>> >> >
> > > > > > >>> >>
> > > > > > >>> >
> > > > > > >>> >
> > > > > > >>> > --
> > > > > > >>> > Sincerely yours, Ivan Daschinskiy
> > > > > > >>> >
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> --
> > > > > > >>> Sincerely yours, Ivan Daschinskiy
> > > > > > >>>
> > > > > > >>
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Sincerely yours, Ivan Daschinskiy
> > > > >
> > > >
> > >
> > >
> > > --
> > > Sincerely yours, Ivan Daschinskiy
> > >
> >
>


-- 

Best regards,
Alexei Scherbakov

Re: IEP-61 Technical discussion

Posted by Alexey Goncharuk <al...@gmail.com>.

Folks, thanks to everyone who joined the call. Summary:

   - We agree that it may be beneficial to separate the metastorage and group
   membership services. However, the abstractions should be clean enough so
   that we could implement group membership via the metastorage
   - Production cluster setup will involve an administrator 'init' command
   that will initialize the metastorage raft group. Once the metastorage is
   initialized, all nodes may be restarted arbitrarily
   - An HA cluster must contain at least 3 nodes. A 2-node cluster will stop
   making progress when one of the nodes fails (the metastorage raft group
   needs a majority of its members to be available)
   - We will provide a 'developer' cluster mode which will allow a 1-node
   setup and auto-initialization without the 'init' command
   - We are targeting centralized affinity calculation, with the result stored
   in the metastorage. Metastorage downtime does not necessarily mean cluster
   unavailability (subject to the partition replication protocol choice). It
   would be good to hide the partition object as much as possible so that we
   could support range partitioning in the future
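
The node-count points above follow from majority-quorum arithmetic. Below is
a minimal sketch, assuming the metastorage raft group needs a strict majority
of its voting members to make progress. The class and method names are
hypothetical and are not part of Ignite:

```java
/** Illustrative only: majority-quorum arithmetic for a raft group of n voting members. */
public final class QuorumMath {
    private QuorumMath() {
    }

    /** Smallest number of members that forms a strict majority of n. */
    public static int quorum(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("group must have at least one member");
        }
        return n / 2 + 1;
    }

    /** Number of members that may fail while a majority is still available. */
    public static int tolerableFailures(int n) {
        return n - quorum(n);
    }

    public static void main(String[] args) {
        // Print the quorum size and fault tolerance for small group sizes.
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " node(s): quorum=" + quorum(n)
                + ", tolerable failures=" + tolerableFailures(n));
        }
    }
}
```

Under this assumption a 2-node group tolerates no failures, while a 3-node
group tolerates one, which matches the 3-node minimum for an HA cluster. A
1-node group can make progress on its own but has no redundancy, which is
consistent with limiting it to the 'developer' mode.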

To discuss at the next meeting (do not hesitate to send questions here
before the meeting):

   - Raft implementation details (API model, porting, etc.)
   - Interaction of transactions with the replication protocol
   - Weaker consistency options

Please add more if I forgot something and let's choose a time for the next
meeting.

--AG


Re: IEP-61 Technical discussion

Posted by Kseniya Romanova <ro...@gmail.com>.

Done

Thu, Nov 26, 2020 at 13:18, Ivan Daschinsky <iv...@gmail.com>:

> Alexey, is it possible to manage call at 16:00 MSK?

Re: IEP-61 Technical discussion

Posted by Ivan Daschinsky <iv...@gmail.com>.

Alexey, is it possible to hold the call at 16:00 MSK?

Thu, Nov 26, 2020 at 12:30, Alexey Goncharuk <al...@gmail.com>:

> Hi Ivan,
>
> Unfortunately, the earliest window available for us is 12:00 MSK (1 hour
> slot), or after 14:30 MSK. Let me know what time works best for you.
> > > >>> >> > >> ---------- Forwarded message ---------
> > > >>> >> > >> От: Ivan Daschinsky <iv...@gmail.com>
> > > >>> >> > >> Date: чт, 19 нояб. 2020 г. в 13:02
> > > >>> >> > >> Subject: Re: IEP-61 Technical discussion
> > > >>> >> > >> To: Alexey Goncharuk <al...@gmail.com>
> > > >>> >> > >>
> > > >>> >> > >>
> > > >>> >> > >> Alexey, let's arise another question. Specifically, how
> nodes
> > > >>> >> initially
> > > >>> >> > >> find each other (discovery) and how they detect failures.
> > > >>> >> > >>
> > > >>> >> > >> I suppose, that gossip protocol is an ideal candidate. For
> > > >>> example,
> > > >>> >> > >> consul [1] uses this approach, using serf [2] library to
> > > discover
> > > >>> >> > members
> > > >>> >> > >> of cluster.
> > > >>> >> > >> Then consul forms raft ensemble (server nodes) and client
> use
> > > >>> raft
> > > >>> >> > >> ensemble only as lock service.
> > > >>> >> > >>
> > > >>> >> > >> PacificA suggests internal heartbeats mechanism for failure
> > > >>> >> detection of
> > > >>> >> > >> replicated group, but it says nothing about initial
> discovery
> > > of
> > > >>> >> nodes.
> > > >>> >> > >>
> > > >>> >> > >> WDYT?
> > > >>> >> > >>
> > > >>> >> > >> [1] -- https://www.consul.io/docs/architecture/gossip
> > > >>> >> > >> [2] -- https://www.serf.io/
> > > >>> >> > >>
> > > >>> >> > >> чт, 19 нояб. 2020 г. в 12:46, Alexey Goncharuk <
> > > >>> >> > >> alexey.goncharuk@gmail.com>:
> > > >>> >> > >>
> > > >>> >> > >>> Following up the Ignite 3.0 scope/development approach
> > > threads,
> > > >>> >> this is
> > > >>> >> > >>> a separate thread to discuss technical aspects of the IEP.
> > > >>> >> > >>>
> > > >>> >> > >>> Let's reiterate one more time on the questions raised by
> > Ivan
> > > >>> and
> > > >>> >> also
> > > >>> >> > >>> see if there are any other thoughts on the IEP:
> > > >>> >> > >>>
> > > >>> >> > >>>    - *Whether to deploy metastorage on a separate subset
> of
> > > the
> > > >>> >> nodes
> > > >>> >> > >>>    or allow Ignite to choose these nodes automatically.* I
> > > >>> think it
> > > >>> >> is
> > > >>> >> > >>>    feasible to maintain both modes: by default, Ignite
> will
> > > >>> choose
> > > >>> >> > >>>    metastorage nodes automatically which essentially will
> > > >>> provide
> > > >>> >> the
> > > >>> >> > same
> > > >>> >> > >>>    seamless user experience as TCP discovery SPI - no
> > separate
> > > >>> >> roles,
> > > >>> >> > >>>    simplistic deployment. For deployments where people
> want
> > to
> > > >>> have
> > > >>> >> > more
> > > >>> >> > >>>    fine-grained control over the nodes' assignments, we
> will
> > > >>> >> provide a
> > > >>> >> > runtime
> > > >>> >> > >>>    configuration which will allow pinning metastorage
> group
> > to
> > > >>> >> certain
> > > >>> >> > nodes,
> > > >>> >> > >>>    thus eliminating the latency concerns.
> > > >>> >> > >>>    - *Whether there are any TLA+ specs for the PacificA
> > > >>> protocol.*
> > > >>> >> Not
> > > >>> >> > >>>    to my knowledge, but it is known to be used in
> production
> > > by
> > > >>> >> > Microsoft and
> > > >>> >> > >>>    other projects, e.g. [1]
> > > >>> >> > >>>
> > > >>> >> > >>> I would like to collect general feedback on the IEP, as
> well
> > > as
> > > >>> >> > feedback
> > > >>> >> > >>> on specific parts of it, such as:
> > > >>> >> > >>>
> > > >>> >> > >>>    - Metastorage API
> > > >>> >> > >>>    - Any existing library that can be used to avoid
> > > >>> re-implementing
> > > >>> >> the
> > > >>> >> > >>>    protocol ourselves? Perhaps, porting the existing
> > > >>> implementation
> > > >>> >> to
> > > >>> >> > Java
> > > >>> >> > >>>    (the way TiKV did with etcd-raft [2] [3]? This is a
> very
> > > >>> neat way
> > > >>> >> > btw in my
> > > >>> >> > >>>    opinion because I like the finite automata-like
> approach
> > of
> > > >>> the
> > > >>> >> > replication
> > > >>> >> > >>>    module, and, additionally, we could sync bug fixes and
> > > >>> >> improvements
> > > >>> >> > from
> > > >>> >> > >>>    the upstream project)
> > > >>> >> > >>>
> > > >>> >> > >>>
> > > >>> >> > >>> Thanks,
> > > >>> >> > >>> --AG
> > > >>> >> > >>>
> > > >>> >> > >>> [1]
> > > >>> >> > >>>
> > > >>> >>
> > > https://cwiki.apache.org/confluence/display/INCUBATOR/PegasusProposal
> > > >>> >> > >>> [2] https://github.com/etcd-io/etcd/tree/master/raft
> > > >>> >> > >>> [3] https://github.com/tikv/raft-rs
> > > >>> >> > >>>
> > > >>> >> > >>
> > > >>> >> > >>
> > > >>> >> > >> --
> > > >>> >> > >> Sincerely yours, Ivan Daschinskiy
> > > >>> >> > >>
> > > >>> >> > >>
> > > >>> >> > >> --
> > > >>> >> > >> Sincerely yours, Ivan Daschinskiy
> > > >>> >> > >>
> > > >>> >> > >
> > > >>> >> > >
> > > >>> >> > > --
> > > >>> >> > > Sincerely yours, Ivan Daschinskiy
> > > >>> >> > >
> > > >>> >> >
> > > >>> >> >
> > > >>> >> > --
> > > >>> >> > Sincerely yours, Ivan Daschinskiy
> > > >>> >> >
> > > >>> >>
> > > >>> >
> > > >>> >
> > > >>> > --
> > > >>> > Sincerely yours, Ivan Daschinskiy
> > > >>> >
> > > >>>
> > > >>>
> > > >>> --
> > > >>> Sincerely yours, Ivan Daschinskiy
> > > >>>
> > > >>
> > >
> >
> >
> > --
> > Sincerely yours, Ivan Daschinskiy
> >
>


-- 
Sincerely yours, Ivan Daschinskiy

Re: IEP-61 Technical discussion

Posted by Alexey Goncharuk <al...@gmail.com>.

Hi Ivan,

Unfortunately, the earliest window available for us is 12:00 MSK (1 hour
slot), or after 14:30 MSK. Let me know what time works best for you.

On Wed, Nov 25, 2020 at 21:38, Ivan Daschinsky <iv...@gmail.com> wrote:

> Alexey, I kindly ask you to move the meeting a little earlier, ideally to
> the morning.

Re: IEP-61 Technical discussion

Posted by Ivan Daschinsky <iv...@gmail.com>.

Alexey, I kindly ask you to move the meeting a little earlier, ideally to
the morning.

On Wed, Nov 25, 2020 at 20:10, Alexey Goncharuk <al...@gmail.com> wrote:

> Folks, let's have the call on Friday, Nov 27th at 18:00 MSK? We can use the
> following waiting room link:
>  https://zoom.us/j/99450012496?pwd=RWZmOGhCNWlRK0ZpamdOOTZsYTJ0dz09
>
> Let me know if this time works for everybody.


-- 
Sincerely yours, Ivan Daschinskiy

Re: IEP-61 Technical discussion

Posted by Alexey Goncharuk <al...@gmail.com>.

Folks, let's have the call on Friday, Nov 27th at 18:00 MSK? We can use the
following waiting room link:
 https://zoom.us/j/99450012496?pwd=RWZmOGhCNWlRK0ZpamdOOTZsYTJ0dz09

Let me know if this time works for everybody.

On Wed, Nov 25, 2020 at 16:42, Alexey Goncharuk <al...@gmail.com> wrote:

> Folks,
>
> I've made some edits in IEP-61 [1] regarding the group membership service
> and transaction protocol interaction with the replication infrastructure,
> please take a look before our Friday call.
>
> [1]
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-61%3A+Common+Replication+Infrastructure
>
> On Mon, Nov 23, 2020 at 13:28, Alexey Goncharuk <alexey.goncharuk@gmail.com> wrote:
>
>> Thanks, Ivan,
>>
>> Another protocol for group membership worth checking out is RAPID [1] (a
>> recent one). Not sure though if there are any available implementations for
>> it already.
>>
>> [1] https://www.usenix.org/system/files/conference/atc18/atc18-suresh.pdf
>>
>> On Mon, Nov 23, 2020 at 10:46, Ivan Daschinsky <iv...@gmail.com> wrote:
>>
>>> Also, here is some interesting reading about gossip, SWIM etc.
>>>
>>> 1 -- http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf
>>> 2 -- http://www.antonkharenko.com/2015/09/swim-distributed-group-membership.html
>>> 3 -- https://github.com/hashicorp/memberlist (foundation library of HashiCorp Serf)
>>> 4 -- https://github.com/scalecube/scalecube-cluster (Java implementation of SWIM)
>>>
>>> On Thu, Nov 19, 2020 at 16:35, Ivan Daschinsky <iv...@gmail.com> wrote:
>>>
>>> > >> Friday, Nov 27th work for you? If ok, let's have an open call then.
>>> > Yes, great
>>> > >> As for the protocol port - we will not be dealing with the
>>> > concurrency...
>>> > >>Judging by the Rust port, it seems fairly straightforward.
>>> > Yes, they chose to split transport and logic. But the original Go package
>>> > from etcd (see raft/node.go) contains a heartbeat mechanism, etc.
>>> > I agree with you, this does not seem to be a huge deal to port.
>>> >
>>> > On Thu, Nov 19, 2020 at 16:13, Alexey Goncharuk <alexey.goncharuk@gmail.com> wrote:
>>> >
>>> >> Ivan,
>>> >>
>>> >> Agree, let's have a call to discuss the IEP. I have some more thoughts
>>> >> regarding how the replication infrastructure works with
>>> >> atomic/transactional caches, will put this info to the IEP. Does next
>>> >> Friday, Nov 27th work for you? If ok, let's have an open call then.
>>> >>
>>> >> As for the protocol port - we will not be dealing with the concurrency
>>> >> model if we choose this way, this is what I like about their code
>>> >> structure. Essentially, the raft module is a single-threaded automaton
>>> >> which has a callback to process a message, a callback to process a tick
>>> >> (timeout), and produces messages that should be sent and log entries that
>>> >> should be persisted. Judging by the Rust port, it seems fairly
>>> >> straightforward. Will be happy to discuss this and other alternatives on
>>> >> the call as well.
>>> >>
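
A minimal Java sketch of the structure described in the message above: a
single-threaded core that is driven by "process a message" and "process a
tick" calls and only returns what should be sent and what should be
persisted. Every name here (ReplicationCoreSketch, Ready, step, tick, ready,
and the string message types) is a hypothetical placeholder chosen for
illustration; this is not the etcd-raft or raft-rs API, and the election and
append logic is reduced to a toy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: none of these names come from etcd-raft or raft-rs.
final class ReplicationCoreSketch {
    /** Output the caller must act on: messages to send and entries to persist. */
    static final class Ready {
        final List<String> messagesToSend;
        final List<String> entriesToPersist;

        Ready(List<String> messagesToSend, List<String> entriesToPersist) {
            this.messagesToSend = messagesToSend;
            this.entriesToPersist = entriesToPersist;
        }
    }

    private final int electionTimeoutTicks;
    private int elapsedTicks;
    private boolean campaigning;
    private final List<String> outbox = new ArrayList<>();
    private final List<String> unpersisted = new ArrayList<>();

    ReplicationCoreSketch(int electionTimeoutTicks) {
        this.electionTimeoutTicks = electionTimeoutTicks;
    }

    /** Advances the logical clock; the caller decides how long one tick lasts. */
    void tick() {
        elapsedTicks++;
        if (!campaigning && elapsedTicks >= electionTimeoutTicks) {
            campaigning = true;
            outbox.add("RequestVote"); // queued, not sent: the core does no I/O
        }
    }

    /** Processes one inbound message; only "Append" is modeled in this sketch. */
    void step(String type, String payload) {
        if ("Append".equals(type)) {
            elapsedTicks = 0;
            campaigning = false;
            unpersisted.add(payload);
            outbox.add("AppendAck");
        }
    }

    /** Drains pending output so the caller can send and persist it. */
    Ready ready() {
        Ready result = new Ready(new ArrayList<>(outbox), new ArrayList<>(unpersisted));
        outbox.clear();
        unpersisted.clear();
        return result;
    }

    public static void main(String[] args) {
        ReplicationCoreSketch core = new ReplicationCoreSketch(2);
        core.tick();
        core.tick(); // timeout reached: a vote request is queued
        System.out.println(core.ready().messagesToSend);
        core.step("Append", "x=1");
        Ready out = core.ready();
        System.out.println(out.entriesToPersist + " " + out.messagesToSend);
    }
}
```

Because the core never touches threads, sockets, or disk, the host (here it
would be Ignite) is free to pick its own threading and transport, which is
the property the message points to as making a port tractable.
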
>>> >> On Thu, Nov 19, 2020 at 14:41, Ivan Daschinsky <iv...@gmail.com> wrote:
>>> >>
>>> >> > > Any existing library that can be used to avoid re-implementing the
>>> >> > > protocol ourselves? Perhaps, porting the existing implementation to Java
>>> >> > Personally, I like this idea. Go libraries (either the raft module of etcd
>>> >> > or serf by HashiCorp) are famous for clean code, good design, stability,
>>> >> > and modest size.
>>> >> > But, on the other hand, Go has a different concurrency model, and porting
>>> >> > probably will not be so straightforward.
>>> >> >
>>> >> >
>>> >> >
>>> >> > On Thu, Nov 19, 2020 at 13:48, Ivan Daschinsky <ivandasch@gmail.com> wrote:
>>> >> >
>>> >> > > I'd suggest discussing this IEP and its technical details in an open Zoom
>>> >> > > meeting.
>>> >> > >
>>> >> > > On Thu, Nov 19, 2020 at 13:47, Ivan Daschinsky <ivandasch@gmail.com> wrote:
>>> >> > >
>>> >> > >>
>>> >> > >>
>>> >> > >> ---------- Forwarded message ---------
>>> >> > >> From: Ivan Daschinsky <iv...@gmail.com>
>>> >> > >> Date: Thu, Nov 19, 2020 at 13:02
>>> >> > >> Subject: Re: IEP-61 Technical discussion
>>> >> > >> To: Alexey Goncharuk <al...@gmail.com>
>>> >> > >>
>>> >> > >>
>>> >> > >> Alexey, let's raise another question. Specifically, how nodes initially
>>> >> > >> find each other (discovery) and how they detect failures.
>>> >> > >>
>>> >> > >> I suppose that a gossip protocol is an ideal candidate. For example,
>>> >> > >> Consul [1] uses this approach, using the Serf [2] library to discover
>>> >> > >> members of the cluster.
>>> >> > >> Then Consul forms a Raft ensemble (server nodes), and clients use the Raft
>>> >> > >> ensemble only as a lock service.
>>> >> > >>
>>> >> > >> PacificA suggests an internal heartbeat mechanism for failure detection
>>> >> > >> within a replicated group, but it says nothing about initial discovery of
>>> >> > >> nodes.
>>> >> > >>
>>> >> > >> WDYT?
>>> >> > >>
>>> >> > >> [1] -- https://www.consul.io/docs/architecture/gossip
>>> >> > >> [2] -- https://www.serf.io/
>>> >> > >>
>>> >> > >> On Thu, Nov 19, 2020 at 12:46, Alexey Goncharuk <alexey.goncharuk@gmail.com> wrote:
>>> >> > >>
>>> >> > >>> Following up the Ignite 3.0 scope/development approach threads, this is
>>> >> > >>> a separate thread to discuss technical aspects of the IEP.
>>> >> > >>>
>>> >> > >>> Let's reiterate one more time on the questions raised by Ivan and also
>>> >> > >>> see if there are any other thoughts on the IEP:
>>> >> > >>>
>>> >> > >>>    - *Whether to deploy metastorage on a separate subset of the nodes
>>> >> > >>>    or allow Ignite to choose these nodes automatically.* I think it is
>>> >> > >>>    feasible to maintain both modes: by default, Ignite will choose
>>> >> > >>>    metastorage nodes automatically which essentially will provide the same
>>> >> > >>>    seamless user experience as TCP discovery SPI - no separate roles,
>>> >> > >>>    simplistic deployment. For deployments where people want to have more
>>> >> > >>>    fine-grained control over the nodes' assignments, we will provide a
>>> >> > >>>    runtime configuration which will allow pinning metastorage group to
>>> >> > >>>    certain nodes, thus eliminating the latency concerns.
>>> >> > >>>    - *Whether there are any TLA+ specs for the PacificA protocol.* Not
>>> >> > >>>    to my knowledge, but it is known to be used in production by Microsoft
>>> >> > >>>    and other projects, e.g. [1]
>>> >> > >>>
>>> >> > >>> I would like to collect general feedback on the IEP, as well as
>>> >> > feedback
>>> >> > >>> on specific parts of it, such as:
>>> >> > >>>
>>> >> > >>>    - Metastorage API
>>> >> > >>>    - Any existing library that can be used to avoid
>>> re-implementing
>>> >> the
>>> >> > >>>    protocol ourselves? Perhaps, porting the existing
>>> implementation
>>> >> to
>>> >> > Java
>>> >> > >>>    (the way TiKV did with etcd-raft [2] [3]? This is a very
>>> neat way
>>> >> > btw in my
>>> >> > >>>    opinion because I like the finite automata-like approach of
>>> the
>>> >> > replication
>>> >> > >>>    module, and, additionally, we could sync bug fixes and
>>> >> improvements
>>> >> > from
>>> >> > >>>    the upstream project)
>>> >> > >>>
>>> >> > >>>
>>> >> > >>> Thanks,
>>> >> > >>> --AG
>>> >> > >>>
>>> >> > >>> [1]
>>> >> > >>>
>>> >> https://cwiki.apache.org/confluence/display/INCUBATOR/PegasusProposal
>>> >> > >>> [2] https://github.com/etcd-io/etcd/tree/master/raft
>>> >> > >>> [3] https://github.com/tikv/raft-rs
>>> >> > >>>
>>> >> > >>
>>> >> > >>
>>> >> > >> --
>>> >> > >> Sincerely yours, Ivan Daschinskiy
>>> >> > >>
>>> >> > >>
>>> >> > >> --
>>> >> > >> Sincerely yours, Ivan Daschinskiy
>>> >> > >>
>>> >> > >
>>> >> > >
>>> >> > > --
>>> >> > > Sincerely yours, Ivan Daschinskiy
>>> >> > >
>>> >> >
>>> >> >
>>> >> > --
>>> >> > Sincerely yours, Ivan Daschinskiy
>>> >> >
>>> >>
>>> >
>>> >
>>> > --
>>> > Sincerely yours, Ivan Daschinskiy
>>> >
>>>
>>>
>>> --
>>> Sincerely yours, Ivan Daschinskiy
>>>
>>

Re: IEP-61 Technical discussion

Posted by Alexey Goncharuk <al...@gmail.com>.

Folks,

I've made some edits to IEP-61 [1] regarding the group membership service
and the interaction of the transaction protocol with the replication
infrastructure. Please take a look before our Friday call.

[1]
https://cwiki.apache.org/confluence/display/IGNITE/IEP-61%3A+Common+Replication+Infrastructure

Mon, Nov 23, 2020 at 13:28, Alexey Goncharuk <al...@gmail.com> wrote:

> Thanks, Ivan,
>
> Another protocol for group membership worth checking out is RAPID [1] (a
> recent one). Not sure though if there are any available implementations for
> it already.
>
> [1] https://www.usenix.org/system/files/conference/atc18/atc18-suresh.pdf

Re: IEP-61 Technical discussion

Posted by Alexey Goncharuk <al...@gmail.com>.

Thanks, Ivan.

Another group membership protocol worth checking out is RAPID [1] (a
recent one). I am not sure, though, whether any implementations of it are
available yet.

[1] https://www.usenix.org/system/files/conference/atc18/atc18-suresh.pdf

Mon, Nov 23, 2020 at 10:46, Ivan Daschinsky <iv...@gmail.com> wrote:

> Also, here is some interesting reading about gossip, SWIM etc.
>
> 1 --
> http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf
> 2 --
> http://www.antonkharenko.com/2015/09/swim-distributed-group-membership.html
> 3 -- https://github.com/hashicorp/memberlist (Foundation library of
> hashicorp serf)
> 4 -- https://github.com/scalecube/scalecube-cluster -- (Java
> implementation
> of SWIM)

Re: IEP-61 Technical discussion

Posted by Ivan Daschinsky <iv...@gmail.com>.

Also, here is some interesting reading about gossip, SWIM, etc.

1 -- http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf
2 -- http://www.antonkharenko.com/2015/09/swim-distributed-group-membership.html
3 -- https://github.com/hashicorp/memberlist (the foundation library of
HashiCorp Serf)
4 -- https://github.com/scalecube/scalecube-cluster (a Java implementation
of SWIM)
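
For readers new to SWIM, the failure-detection half of the protocol from the
paper (link 1 above) can be sketched roughly as follows. This is a simplified
illustration, not code from memberlist or scalecube-cluster: the class
SwimProbeSketch, the probe method and the reachable predicate (a stand-in for
"sent a ping and received an ack within the timeout") are hypothetical names,
and the dissemination of membership updates is left out entirely.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.BiPredicate;

/** Hypothetical sketch of one SWIM-style probe; not taken from any library. */
final class SwimProbeSketch {
    enum Verdict { ALIVE, SUSPECT }

    /**
     * Probes target once on behalf of self.
     * reachable.test(a, b) is an assumed stand-in for "a pinged b and got an ack in time".
     */
    static Verdict probe(String self, String target, List<String> members, int k,
                         BiPredicate<String, String> reachable, Random rnd) {
        // Step 1: direct ping.
        if (reachable.test(self, target)) return Verdict.ALIVE;

        // Step 2: ask up to k other members to ping the target on our behalf (ping-req).
        List<String> helpers = new ArrayList<>(members);
        helpers.remove(self);
        helpers.remove(target);
        Collections.shuffle(helpers, rnd);
        for (String helper : helpers.subList(0, Math.min(k, helpers.size()))) {
            if (reachable.test(self, helper) && reachable.test(helper, target)) return Verdict.ALIVE;
        }

        // Step 3: neither the direct nor any indirect path produced an ack.
        return Verdict.SUSPECT;
    }

    public static void main(String[] args) {
        List<String> members = List.of("a", "b", "c", "d");
        // Only the a -> b link is broken; every other pair can talk.
        BiPredicate<String, String> reachable = (x, y) -> !(x.equals("a") && y.equals("b"));
        System.out.println(probe("a", "b", members, 2, reachable, new Random(42)));
    }
}
```

In the paper's basic protocol a member that fails both checks is declared
failed right away; the suspicion extension first marks it as suspected, which
is the verdict this sketch returns.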

Thu, Nov 19, 2020 at 16:35, Ivan Daschinsky <iv...@gmail.com> wrote:

> >> Friday, Nov 27th work for you? If ok, let's have an open call then.
> Yes, great
> >> As for the protocol port - we will not be dealing with the
> concurrency...
> >>Judging by the Rust port, it seems fairly straightforward.
> Yes, they chose split transport and logic. But original Go package from
> etcd (see raft/node.go) contains some  heartbeats mechanism etc.
> I agree with you, this seems not to be a huge deal to port.


-- 
Sincerely yours, Ivan Daschinskiy

Re: IEP-61 Technical discussion

Posted by Ivan Daschinsky <iv...@gmail.com>.

>> Friday, Nov 27th work for you? If ok, let's have an open call then.
Yes, great.
>> As for the protocol port - we will not be dealing with the concurrency...
>> Judging by the Rust port, it seems fairly straightforward.
Yes, they chose to split the transport from the logic. But the original Go
package from etcd (see raft/node.go) also contains a heartbeat mechanism,
etc. I agree with you, this does not seem to be a huge deal to port.

Thu, Nov 19, 2020 at 16:13, Alexey Goncharuk <al...@gmail.com> wrote:

> Ivan,
>
> Agree, let's have a call to discuss the IEP. I have some more thoughts
> regarding how the replication infrastructure works with
> atomic/transactional caches, will put this info to the IEP. Does next
> Friday, Nov 27th work for you? If ok, let's have an open call then.
>
> As for the protocol port - we will not be dealing with the concurrency
> model if we choose this way, this is what I like about their code
> structure. Essentially, the raft module is a single-threaded automata which
> has a callback to process a message, process a tick (timeout) and produces
> messages that should be sent and log entries that should be persisted.
> Judging by the Rust port, it seems fairly straightforward. Will be happy to
> discuss this and other alternatives on the call as well.


-- 
Sincerely yours, Ivan Daschinskiy

Re: IEP-61 Technical discussion

Posted by Alexey Goncharuk <al...@gmail.com>.

Ivan,

Agreed, let's have a call to discuss the IEP. I have some more thoughts on
how the replication infrastructure works with atomic/transactional caches;
I will add them to the IEP. Does next Friday, Nov 27th, work for you? If so,
let's have an open call then.

As for the protocol port: if we go this way, we will not have to deal with
the concurrency model, which is what I like about their code structure.
Essentially, the Raft module is a single-threaded automaton with callbacks
to process a message and to process a tick (timeout); it produces messages
that should be sent and log entries that should be persisted. Judging by the
Rust port, porting it seems fairly straightforward. I will be happy to
discuss this and other alternatives on the call as well.
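
The shape described above (a core that owns no threads and does no I/O, is
driven by message and tick callbacks, and has its output drained by the
caller) could look roughly like this in Java. This is a toy sketch, not a
port of etcd-raft: the names ReplicationCoreSketch, step, tick, propose and
ready are assumptions chosen for illustration, and the only behaviour
modelled is periodic heartbeats and buffering of proposed entries. Elections,
log matching and commit tracking are omitted.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy, hypothetical replication core: no threads, no I/O, no real Raft logic. */
final class ReplicationCoreSketch {
    /** Illustrative message type; a real protocol carries far more state. */
    static final class Message {
        final long from, to, term;
        final String type;

        Message(long from, long to, String type, long term) {
            this.from = from; this.to = to; this.type = type; this.term = term;
        }
    }

    /** What the caller must do next: send these messages, persist these entries. */
    static final class Ready {
        final List<Message> outbound;
        final List<String> entriesToPersist;

        Ready(List<Message> outbound, List<String> entriesToPersist) {
            this.outbound = outbound; this.entriesToPersist = entriesToPersist;
        }
    }

    private final long id;
    private final List<Long> peers;
    private final int heartbeatTicks;
    private long term = 1;
    private int elapsedTicks;
    private List<Message> outbound = new ArrayList<>();
    private List<String> unstable = new ArrayList<>();

    ReplicationCoreSketch(long id, List<Long> peers, int heartbeatTicks) {
        this.id = id; this.peers = peers; this.heartbeatTicks = heartbeatTicks;
    }

    /** Logical clock; the caller invokes this from its own timer. */
    void tick() {
        if (++elapsedTicks >= heartbeatTicks) {
            elapsedTicks = 0;
            for (long peer : peers) outbound.add(new Message(id, peer, "HEARTBEAT", term));
        }
    }

    /** Feeds one inbound message into the state machine. */
    void step(Message m) {
        if (m.term > term) term = m.term; // adopt a newer term
        if (m.type.equals("HEARTBEAT")) outbound.add(new Message(id, m.from, "HEARTBEAT_ACK", term));
    }

    /** Buffers a client entry until the caller persists it. */
    void propose(String entry) { unstable.add(entry); }

    /** Drains pending output; the caller performs the actual network and disk I/O. */
    Ready ready() {
        Ready r = new Ready(outbound, unstable);
        outbound = new ArrayList<>();
        unstable = new ArrayList<>();
        return r;
    }

    public static void main(String[] args) {
        ReplicationCoreSketch core = new ReplicationCoreSketch(1L, List.of(2L, 3L), 2);
        core.tick();
        core.tick();                 // the second tick reaches the heartbeat interval
        core.propose("put k=v");
        Ready r = core.ready();
        System.out.println(r.outbound.size() + " messages, " + r.entriesToPersist.size() + " entries");
    }
}
```

Because every method runs on the caller's thread and only mutates local
state, the surrounding system is free to pick its own threading, transport
and storage, and the core can be unit-tested deterministically by feeding it
messages and ticks.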

Thu, Nov 19, 2020 at 14:41, Ivan Daschinsky <iv...@gmail.com> wrote:

> > Any existing library that can be used to avoid re-implementing the
> protocol ourselves? Perhaps, porting the existing implementation to Java
> Personally, I like this idea. Go libraries (either raft module of etcd or
> serf by Hashicorp) are famous for clean code, good design, stability, not
> enormous size.
> But, on other side, Go has different model for concurrency and porting
> probably will not be so straightforward.

Re: IEP-61 Technical discussion

Posted by Ivan Daschinsky <iv...@gmail.com>.

> Any existing library that can be used to avoid re-implementing the
> protocol ourselves? Perhaps, porting the existing implementation to Java

Personally, I like this idea. Go libraries (either the raft module of etcd or
serf by HashiCorp) are known for clean code, good design, stability, and
modest size.
On the other hand, Go has a different concurrency model, so porting
probably will not be so straightforward.
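
To illustrate why the concurrency difference need not block a port, here is a
minimal Java sketch of the state-machine style that the etcd raft module
follows: the core is advanced one event at a time and leaves threads, timers,
and networking to the caller. Everything below (the class names, the two event
types, the tick-based timeout) is a hypothetical simplification for
illustration and not the actual etcd or Ignite API.

```java
import java.util.ArrayList;
import java.util.List;

public class StepMachineSketch {
    enum Role { FOLLOWER, CANDIDATE }

    // Hypothetical input events; a real port would carry protocol messages.
    enum Event { TICK, HEARTBEAT }

    /** Pure state machine: no threads, no I/O, no wall clock. */
    static final class Machine {
        private final int electionTimeoutTicks;
        private int ticksSinceHeartbeat;
        private Role role = Role.FOLLOWER;
        private final List<String> outbox = new ArrayList<>();

        Machine(int electionTimeoutTicks) {
            this.electionTimeoutTicks = electionTimeoutTicks;
        }

        /** Advances the machine by exactly one event. */
        void step(Event e) {
            switch (e) {
                case HEARTBEAT:
                    // A leader is alive: reset the timer and stay a follower.
                    ticksSinceHeartbeat = 0;
                    role = Role.FOLLOWER;
                    break;
                case TICK:
                    ticksSinceHeartbeat++;
                    if (role == Role.FOLLOWER
                            && ticksSinceHeartbeat >= electionTimeoutTicks) {
                        role = Role.CANDIDATE;
                        // Queue a message instead of sending it directly.
                        outbox.add("RequestVote");
                    }
                    break;
            }
        }

        Role role() {
            return role;
        }

        /** Returns and clears the messages the caller should send. */
        List<String> drainOutbox() {
            List<String> out = new ArrayList<>(outbox);
            outbox.clear();
            return out;
        }
    }

    public static void main(String[] args) {
        Machine m = new Machine(3);
        m.step(Event.TICK);
        m.step(Event.TICK);
        m.step(Event.HEARTBEAT); // resets the election timer
        m.step(Event.TICK);
        m.step(Event.TICK);
        System.out.println(m.role()); // FOLLOWER
        m.step(Event.TICK);
        System.out.println(m.role() + " " + m.drainOutbox()); // CANDIDATE [RequestVote]
    }
}
```

With this shape, goroutines and channels on the Go side map to whatever the
Java caller prefers (a single event-loop thread, an executor, or a test that
feeds events by hand), so only the driver around the core has to be rewritten.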



-- 
Sincerely yours, Ivan Daschinskiy

Re: IEP-61 Technical discussion

Posted by Ivan Daschinsky <iv...@gmail.com>.

I'd suggest discussing this IEP and its technical details in an open Zoom
meeting.


-- 
Sincerely yours, Ivan Daschinskiy

Fwd: IEP-61 Technical discussion

Posted by Ivan Daschinsky <iv...@gmail.com>.

---------- Forwarded message ---------
From: Ivan Daschinsky <iv...@gmail.com>
Date: Thu, Nov 19, 2020 at 13:02
Subject: Re: IEP-61 Technical discussion
To: Alexey Goncharuk <al...@gmail.com>


Alexey, let's raise another question: how do nodes initially find each
other (discovery), and how do they detect failures?

I suppose that a gossip protocol is an ideal candidate. For example, Consul
[1] uses this approach, relying on the serf [2] library to discover cluster
members.
Consul then forms a Raft ensemble (server nodes), and clients use the Raft
ensemble only as a lock service.

PacificA suggests an internal heartbeat mechanism for failure detection
within a replicated group, but it says nothing about the initial discovery
of nodes.

WDYT?

[1] -- https://www.consul.io/docs/architecture/gossip
[2] -- https://www.serf.io/
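
To make the discovery and failure-detection idea concrete, below is a minimal
Java sketch of a heartbeat-counter gossip scheme. It is deliberately simpler
than what serf does (serf is based on the SWIM protocol), and every class,
method, and timeout value here is a hypothetical illustration rather than an
existing Ignite, Consul, or serf API. Each node keeps a view of known peers
with the highest heartbeat counter seen so far, merges views received from
other nodes, and suspects peers whose counter has stopped advancing.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class GossipMembershipSketch {
    /** Per-peer record: highest heartbeat seen and the local time it was seen. */
    static final class Entry {
        final long heartbeat;
        final long seenAtMillis;

        Entry(long heartbeat, long seenAtMillis) {
            this.heartbeat = heartbeat;
            this.seenAtMillis = seenAtMillis;
        }
    }

    static final class Node {
        final String id;
        private long ownHeartbeat;
        private final Map<String, Entry> view = new HashMap<>();

        Node(String id, long nowMillis) {
            this.id = id;
            view.put(id, new Entry(0, nowMillis));
        }

        /** Bumps this node's own heartbeat counter; meant to be called periodically. */
        void beat(long nowMillis) {
            view.put(id, new Entry(++ownHeartbeat, nowMillis));
        }

        /** Snapshot to send to some other node: id -> heartbeat counter. */
        Map<String, Long> digest() {
            Map<String, Long> d = new HashMap<>();
            view.forEach((peer, e) -> d.put(peer, e.heartbeat));
            return d;
        }

        /** Merges a received digest, keeping the larger counter per node. */
        void merge(Map<String, Long> remote, long nowMillis) {
            remote.forEach((peer, hb) -> {
                Entry local = view.get(peer);
                if (local == null || hb > local.heartbeat) {
                    view.put(peer, new Entry(hb, nowMillis));
                }
            });
        }

        /** Peers whose counter has not advanced within the timeout. */
        Set<String> suspected(long nowMillis, long timeoutMillis) {
            Set<String> s = new TreeSet<>();
            view.forEach((peer, e) -> {
                if (!peer.equals(id) && nowMillis - e.seenAtMillis > timeoutMillis) {
                    s.add(peer);
                }
            });
            return s;
        }

        Set<String> members() {
            return new TreeSet<>(view.keySet());
        }
    }

    public static void main(String[] args) {
        Node a = new Node("A", 0), b = new Node("B", 0), c = new Node("C", 0);
        // Discovery: B learns about C, then A learns about both from B alone.
        b.merge(c.digest(), 0);
        a.merge(b.digest(), 0);
        System.out.println(a.members()); // [A, B, C]
        // Failure detection: only B keeps beating; C goes silent.
        b.beat(500);
        a.merge(b.digest(), 500);
        System.out.println(a.suspected(1200, 1000)); // [C]
    }
}
```

Note that A never talks to C directly: membership spreads transitively, which
is what lets a new node join by contacting any single known peer. A consensus
group (Raft or PacificA) can then be formed on top of the discovered members.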

Thu, Nov 19, 2020 at 12:46, Alexey Goncharuk <al...@gmail.com>:

> Following up the Ignite 3.0 scope/development approach threads, this is a
> separate thread to discuss technical aspects of the IEP.
>
> Let's reiterate one more time on the questions raised by Ivan and also see
> if there are any other thoughts on the IEP:
>
>    - *Whether to deploy metastorage on a separate subset of the nodes or
>    allow Ignite to choose these nodes automatically.* I think it is
>    feasible to maintain both modes: by default, Ignite will choose
>    metastorage nodes automatically which essentially will provide the same
>    seamless user experience as TCP discovery SPI - no separate roles,
>    simplistic deployment. For deployments where people want to have more
>    fine-grained control over the nodes' assignments, we will provide a runtime
>    configuration which will allow pinning metastorage group to certain nodes,
>    thus eliminating the latency concerns.
>    - *Whether there are any TLA+ specs for the PacificA protocol.* Not to
>    my knowledge, but it is known to be used in production by Microsoft and
>    other projects, e.g. [1]
>
> I would like to collect general feedback on the IEP, as well as feedback
> on specific parts of it, such as:
>
>    - Metastorage API
>    - Any existing library that can be used to avoid re-implementing the
>    protocol ourselves? Perhaps, porting the existing implementation to Java
>    (the way TiKV did with etcd-raft [2] [3]? This is a very neat way btw in my
>    opinion because I like the finite automata-like approach of the replication
>    module, and, additionally, we could sync bug fixes and improvements from
>    the upstream project)
>
>
> Thanks,
> --AG
>
> [1] https://cwiki.apache.org/confluence/display/INCUBATOR/PegasusProposal
> [2] https://github.com/etcd-io/etcd/tree/master/raft
> [3] https://github.com/tikv/raft-rs
>


-- 
Sincerely yours, Ivan Daschinskiy

