Posted to user@helix.apache.org by Varun Sharma <va...@pinterest.com> on 2015/06/09 00:07:26 UTC

Master-Master state machine for Helix

Hi,

We are looking into the possibility of implementing a Master-Master state
machine using Helix. The idea is that the client writes to all n replicas
of a partition, and these replicas are located on different machines. There is
no communication between the replicas themselves; data is replicated through
multiple client writes.

Each replica exposes the following APIs:
a) backup() - backs up at a sequence number
b) getUpdatesSince(seq_no) - Get the updates since a given seq no
c) setReadOnly(int partition) - make a partition RO
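
To make the contract concrete, here is a rough Java sketch of how these three calls might look. Only the three operations come from the list above. The names ReplicaApi, Update and InMemoryReplica, the long sequence numbers, the return types and the write() path are all assumptions for illustration, not our actual code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// One replicated write, tagged with the sequence number it was applied at.
// (Hypothetical type; the original API list does not define it.)
final class Update {
    final long seqNo;
    final String payload;

    Update(long seqNo, String payload) {
        this.seqNo = seqNo;
        this.payload = payload;
    }
}

// Hypothetical Java shape of the three operations listed above.
interface ReplicaApi {
    long backup();                            // take a backup, return the seq no it was taken at
    List<Update> getUpdatesSince(long seqNo); // updates with a seq no strictly greater than seqNo
    void setReadOnly(int partition);          // reject further client writes to this partition
}

// Toy in-memory replica with a single log, only to illustrate the contract.
final class InMemoryReplica implements ReplicaApi {
    private final List<Update> log = new ArrayList<>();
    private final Set<Integer> readOnlyPartitions = new HashSet<>();

    // Client write path (an assumption). Returns false if the partition is RO.
    boolean write(int partition, String payload) {
        if (readOnlyPartitions.contains(partition)) {
            return false;
        }
        log.add(new Update(log.size() + 1, payload));
        return true;
    }

    @Override
    public long backup() {
        // A real backup would persist data; here we only report the current seq no.
        return log.size();
    }

    @Override
    public List<Update> getUpdatesSince(long seqNo) {
        List<Update> out = new ArrayList<>();
        for (Update u : log) {
            if (u.seqNo > seqNo) {
                out.add(u);
            }
        }
        return out;
    }

    @Override
    public void setReadOnly(int partition) {
        readOnlyPartitions.add(partition);
    }
}
```

In this sketch a target replica would call backup() on the source once, then poll getUpdatesSince() with the last sequence number it has applied.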

We are trying to see if we can use Helix to automate the shard copy
operation:
1) A node goes down and the controller computes new shard placements.
For a particular shard:
2) The target replica goes into the "BACKUP" state, in which it uses
RoutingTableProvider to find another replica serving the same shard and
then copies the data over from it.
3) The target replica goes into the "SYNC" state, in which it uses the
getUpdatesSince API to keep syncing with the source replica. This goes on
indefinitely.
4) The controller sets the source replica to RO (not a Helix state).
5) The target replica catches up and moves to the MASTER state (online).
6) The controller sees that the target replica is no longer syncing and
hence marks the shard as no longer RO.
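
The ordering constraints in the steps above can be sketched as a small transition table with one guard. This is plain Java and does not use any Helix API. BACKUP, SYNC and MASTER come from the steps; the OFFLINE state, the names ReplicaState and CopyFlow, and the choice to clear RO as soon as the target leaves SYNC are assumptions made for illustration.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// States of the target replica. OFFLINE is an assumed starting state.
enum ReplicaState { OFFLINE, BACKUP, SYNC, MASTER }

// Tracks one target replica plus the RO flag of its source replica.
final class CopyFlow {
    private static final Map<ReplicaState, Set<ReplicaState>> LEGAL =
            new EnumMap<>(ReplicaState.class);

    static {
        LEGAL.put(ReplicaState.OFFLINE, EnumSet.of(ReplicaState.BACKUP));
        LEGAL.put(ReplicaState.BACKUP, EnumSet.of(ReplicaState.SYNC, ReplicaState.OFFLINE));
        LEGAL.put(ReplicaState.SYNC, EnumSet.of(ReplicaState.MASTER, ReplicaState.OFFLINE));
        LEGAL.put(ReplicaState.MASTER, EnumSet.of(ReplicaState.OFFLINE));
    }

    private ReplicaState target = ReplicaState.OFFLINE;
    private boolean sourceReadOnly = false;

    ReplicaState targetState() {
        return target;
    }

    boolean isSourceReadOnly() {
        return sourceReadOnly;
    }

    // Step 4: only meaningful while the target is syncing from the source.
    void setSourceReadOnly() {
        if (target != ReplicaState.SYNC) {
            throw new IllegalStateException("source can only go RO while target is in SYNC");
        }
        sourceReadOnly = true;
    }

    // Steps 2, 3 and 5, with the ordering guard for step 5.
    void transition(ReplicaState next) {
        if (!LEGAL.get(target).contains(next)) {
            throw new IllegalStateException(target + " -> " + next + " is not allowed");
        }
        if (next == ReplicaState.MASTER && !sourceReadOnly) {
            throw new IllegalStateException("source must be RO before target becomes MASTER");
        }
        target = next;
        // Step 6: once the target is no longer syncing, the shard is no longer RO.
        if (target != ReplicaState.SYNC) {
            sourceReadOnly = false;
        }
    }
}
```

The guard on the SYNC to MASTER transition is the part that depends on state held by a different replica, which is the cross-replica sequencing I am asking about.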

I am wondering if this could work with Helix. The major issue here is that
some of these transitions need to be sequenced in a particular order across
the target replica and the source replica. Does Helix have the ability to
let the participants initiate state transitions?

Thanks!
Varun