Posted to dev@iotdb.apache.org by 徐毅 <xu...@126.com> on 2019/03/28 04:32:36 UTC

IoTDB supports distributed version



Hi,




IoTDB only supports a stand-alone version now. We plan to develop a distributed version in the next two months.

We initially decided to use a master-slave architecture. The master node is responsible for processing read and write requests, and the slave node, which is a copy of the master node, is responsible for processing read-only requests.

In terms of implementation, we currently intend to use the Raft protocol to ensure data consistency across the replica nodes.
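
To make the division of responsibilities concrete, here is a minimal sketch of how a client-side router could direct requests under this master-slave design; all names are illustrative assumptions, not IoTDB code.

    // Hypothetical sketch of the proposed master-slave routing (illustrative only,
    // not IoTDB code): writes always go to the master, reads may go to any replica.
    import java.util.List;
    import java.util.concurrent.ThreadLocalRandom;

    class MasterSlaveRouterSketch {
        static String routeTo(boolean isWrite, String master, List<String> slaves) {
            if (isWrite || slaves.isEmpty()) {
                return master;                    // the master serves reads and writes
            }
            // read-only requests can be spread across the slave replicas
            return slaves.get(ThreadLocalRandom.current().nextInt(slaves.size()));
        }
    }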

I have created an issue on JIRA at [1]. If you have any suggestions, please comment on the JIRA issue or reply to this email.

[1]. https://issues.apache.org/jira/browse/IOTDB-68




Thanks

XuYi

Re: IoTDB supports distributed version

Posted by Xiangdong Huang <sa...@gmail.com>.
Hi,

IoTDB/TsFile can run on a device (edge side) or in a data center.

Firstly, when we say the distributed version, we refer to the latter. If we
run IoTDB/TsFile on the edge, our current design is not about letting these
IoTDB instances form a cluster. What we need to support is letting these
instances send their TsFiles to a data center easily, and letting an IoTDB
instance (or a Spark instance) load these data files easily.

Secondly, the distributed version (in a data center) is for supporting
millions of time series and trillions of data points in a cluster, for
managing data from millions of devices, for managing long-term time series
data, etc.

In many use cases (devices generating time series data) that I have seen,
weak consistency is OK because applications can bear losing some data:
- Even if we guarantee no data loss on the server side, data loss may still
occur in the transmission process.
- Sometimes the data collection frequency is high (maybe 100 points per
second) and the data changes little in a short time range, so losing some
data points is acceptable.

In these cases, weak consistency is OK.

But there may be some businesses that do not allow that. For example:
- Collecting the data is expensive, so I do not want to lose anything.
- Each data point is valuable; how can you guarantee that an outlier point is
not lost?

In those cases, strong consistency is desired. For example:
- Users want to find the outlier data points.
- Users want to get an aggregation value per day, and a precise result is
desired.

As Julian suggested, it would be GREAT if we could collect some real use cases.

Anyway, I think tunable consistency is suitable. When implementing tunable
consistency, quorum-based consistency like Cassandra's is OK: a write succeeds
once it reaches N/2 + 1 replicas. But in my opinion, reading data from multiple
replicas and merging the results is time-consuming (this is what Cassandra
does).
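
To illustrate the quorum idea, here is a minimal sketch (hypothetical names, not IoTDB or Cassandra code): a QUORUM write is acknowledged by a majority of the N replicas, and a read is guaranteed to see the latest acknowledged write whenever the chosen levels satisfy R + W > N.

    // Hypothetical sketch of tunable, quorum-based consistency levels.
    enum ConsistencyLevel { ONE, QUORUM, ALL }

    class QuorumConfigSketch {
        final int replicationFactor;                           // N

        QuorumConfigSketch(int replicationFactor) {
            this.replicationFactor = replicationFactor;
        }

        int requiredAcks(ConsistencyLevel level) {
            switch (level) {
                case ONE:    return 1;
                case QUORUM: return replicationFactor / 2 + 1;  // majority, e.g. 2 of 3
                case ALL:    return replicationFactor;
                default:     throw new IllegalArgumentException("unknown level");
            }
        }

        // A read observes the latest acknowledged write when R + W > N.
        boolean readSeesLatestWrite(ConsistencyLevel read, ConsistencyLevel write) {
            return requiredAcks(read) + requiredAcks(write) > replicationFactor;
        }
    }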

So, I prefer the following two implementations:
- For strong consistency, Raft is OK. But I do not know how much performance it
sacrifices.
- If we do not require strict strong consistency, Raft is overkill for us: as
discussed in previous mails, we do not need all nodes to apply inserts in the
same order, because it is rare for two clients to write data points that belong
to the same device and sensor and have the same timestamp. Besides, IoTDB can
organize the data in time order when it flushes data to disk. In this way,
Cassandra's write process is fine for us. But it is hard to repair data for a
replica (Cassandra's replica repair process is too slow), and in this way,
maybe reading data from just one node is enough (this is different from
Cassandra, because Cassandra lets users make a choice).

Then:
- If we need all data replicas to be identical, choose strong consistency and
the first implementation.
- If we are in pursuit of saving all data no matter how fast it arrives, weak
consistency is OK; use the second implementation. (Because we do not read data
from multiple replicas, this implementation is quite similar to the first one,
but easier.)

As for how to partition data, I think we can do it like this (see the sketch
below):
- Data can be partitioned according to its storage group and time. A storage
group manages several devices (which share the same prefix path). For example,
data generated by Volkswagen belongs to a storage group s1, and data generated
by BMW belongs to another storage group s2. Each data partition stores one
month's data. So, data in s1 from 2019.01-2019.02 belongs to one partition (a
node), and data in s1 from 2019.02-2019.03 belongs to another partition. The
same applies to s2.
- Data can be partitioned according to consistent hashing (like Cassandra),
which saves the time of searching a lookup table.
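
Here is a minimal sketch of what such a partition key could look like; the key format and hash are illustrative assumptions, not IoTDB code. Option 1 derives a (storage group, month) key; option 2 then places that key on a hash ring instead of keeping a lookup table.

    // Hypothetical partitioning sketch (illustrative names and formats only).
    import java.time.Instant;
    import java.time.YearMonth;
    import java.time.ZoneOffset;

    class PartitionKeySketch {
        // Option 1: partition by (storage group, one-month time range).
        static String timePartitionKey(String storageGroup, long timestampMs) {
            YearMonth month = YearMonth.from(
                    Instant.ofEpochMilli(timestampMs).atZone(ZoneOffset.UTC));
            return storageGroup + "#" + month;        // e.g. "root.s1#2019-01"
        }

        // Option 2: map the key onto a consistent-hash ring (a real implementation
        // would use a stronger hash and virtual nodes), avoiding a lookup table.
        static int ringPosition(String partitionKey, int ringSlots) {
            return Math.floorMod(partitionKey.hashCode(), ringSlots);
        }
    }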

So, the definitions of storage groups are synced across the cluster. Keeping
them in sync is critical for the cluster, so Raft is suitable (because it is
possible that two clients create the same storage group concurrently). We can
call this Raft group the metadata Raft group. All nodes are metadata holders.
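
A minimal sketch of why the metadata Raft group resolves the concurrent-creation case (hypothetical names, not IoTDB code): every node applies the replicated log in the same order, so the second, duplicate create simply becomes a no-op.

    // Hypothetical metadata state machine applied in Raft log order on every node.
    import java.util.LinkedHashSet;
    import java.util.Set;

    class MetadataStateMachineSketch {
        private final Set<String> storageGroups = new LinkedHashSet<>();

        // Two clients creating the same storage group concurrently are serialized
        // by the Raft log; the second apply returns false and changes nothing.
        synchronized boolean applyCreateStorageGroup(String path) {
            return storageGroups.add(path);
        }
    }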

Suppose the replication factor is 3. Then, according to their locations on the
consistent-hash ring, 3 nodes will form a data group. A data group may hold the
time series of a particular time range in several storage groups. That is,
these 3 nodes are the data holders for some time series in a particular time
range in several storage groups.
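
For illustration, a data group could be chosen as the partition's ring position plus the next nodes clockwise, in the Cassandra style; this is a sketch under that assumption, not IoTDB code.

    // Hypothetical data-group selection: the owning node and the next
    // (replicationFactor - 1) nodes clockwise on the ring hold the replicas.
    import java.util.ArrayList;
    import java.util.List;

    class DataGroupSelectorSketch {
        static List<String> dataGroup(int ringPosition, List<String> ringNodes,
                                      int replicationFactor) {
            List<String> group = new ArrayList<>();
            for (int i = 0; i < replicationFactor; i++) {
                group.add(ringNodes.get((ringPosition + i) % ringNodes.size()));
            }
            return group;                         // e.g. [node2, node3, node0]
        }
    }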

 - It is better not to let all nodes know all the time series definitions in
detail. A data group just knows the time series that belong to its
corresponding storage groups. This is a very important property, because the
memory cost of each node is then independent of the total metadata size. But if
we partition data not only by the storage group name but also by the time
range, it seems that we have to let all nodes know the definitions of all time
series.

The read/write process is as follows (see the routing sketch below):

- When a client sends a create-storage-group command, a Raft task runs among
all metadata holders (i.e., all nodes).
- When a client sends a create-time-series or write-data command, the command
will be forwarded to a data group. The data group behaves as a black box for
other nodes. Inside the data group, we can run Raft for strong consistency, or
just a quorum-based process for weak consistency.
- When a client sends a query command, the command will be forwarded to several
data groups (e.g., select m1 from root.* will get data from all storage
groups). Each data group also behaves as a black box. Inside a data group, we
can run a Raft read for strong consistency (choice 1: read data only from the
leader node; choice 2: read data from any node, but the node has to block until
it catches up with the leader); or we can read data from any node (like
Cassandra with Consistency.ONE).
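
A minimal routing sketch of the three cases above; all class and method names are illustrative assumptions, not IoTDB code.

    // Hypothetical command routing: metadata commands go through the metadata Raft
    // group, writes go to exactly one data group, and queries fan out to all
    // matching data groups.
    import java.util.Arrays;
    import java.util.List;

    class ClusterRouterSketch {
        enum CommandType { SET_STORAGE_GROUP, CREATE_TIMESERIES, INSERT, QUERY }

        static final long MONTH_MS = 30L * 24 * 60 * 60 * 1000;  // one-month partitions

        static void route(CommandType type, String path, long time) {
            switch (type) {
                case SET_STORAGE_GROUP:
                    // Proposed to the metadata Raft group and applied on all nodes.
                    System.out.println("metadata-raft: propose " + type + " " + path);
                    break;
                case CREATE_TIMESERIES:
                case INSERT:
                    // Forwarded to one data group; inside it, Raft (strong) or a
                    // quorum write (weak) replicates the command.
                    System.out.println("data-group[" + path + "#" + (time / MONTH_MS)
                            + "]: " + type);
                    break;
                case QUERY:
                    // Fanned out to every data group that may hold matching series;
                    // each group answers via a leader read, a catch-up read, or any node.
                    for (String partition : matchingPartitions(path)) {
                        System.out.println("data-group[" + partition + "]: QUERY");
                    }
                    break;
            }
        }

        // Placeholder: a real implementation would resolve the path pattern
        // against the synced storage-group definitions.
        static List<String> matchingPartitions(String pathPattern) {
            return Arrays.asList("root.s1#0", "root.s2#0");
        }

        public static void main(String[] args) {
            route(CommandType.SET_STORAGE_GROUP, "root.s1", 0);
            route(CommandType.INSERT, "root.s1.d1.m1", System.currentTimeMillis());
            route(CommandType.QUERY, "root.*", 0);
        }
    }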

That is what I am thinking... glad to discuss it :p

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


Re: IoTDB supports distributed version

Posted by Julian Feinauer <j....@pragmaticminds.de>.
Hi Felix,

Yes, I see; this would indeed feature the C.
As we are usually very "edge" focused, we do not care that much about the P, as we have no or very few replicas, so there is no big danger of splits.
But it is questionable in this situation whether one wants to use the server or sticks to the TsFile API directly.

Generally speaking, I think the discussion is about how to distribute these values.
One possibility (as stated by XuYi) could be a master-slave approach, where the master keeps all the data and the clients hold the replicas.
In this scenario, consistency could be modelled fairly easily, with different modes (weak, strong).
Another approach could be to distribute all reads/writes according to a "token ring", as Cassandra does, based on a device id or similar.

I guess it really depends on the use cases where one or the other makes sense, so I like this discussion as a way to get a good overview of the use cases and scenarios and find the best solution.

Julian

Re: IoTDB supports distributed version

Posted by Felix Cheung <fe...@hotmail.com>.
The use case I'm thinking about for time series data is a bit sensitive to data loss. Think of transaction records, for example.

I think I'd generally agree on A in CAP too. But is it going to be eventual consistency? Or could it split-brain and lose data?



Re: IoTDB supports distributed version

Posted by Julian Feinauer <j....@pragmaticminds.de>.
Hi Felix,

could you elaborate a bit on your use cases?
I am a bit unsure about the consistency, so it would be interesting to hear where you see the important points.

Thanks!
Julian

Re: IoTDB supports distributed version

Posted by Felix Cheung <fe...@hotmail.com>.
I, on the other hand, would be very interested in the strong consistency option.

(Very cool discussion!)


Re: IoTDB supports distributed version

Posted by Julian Feinauer <j....@pragmaticminds.de>.
Hi,

this is a very interesting (and important) question.
I think we should really consider what we can skip (from an application perspective) and what to keep.
Perhaps a token-ring architecture like the one Cassandra uses could also be a good fit, if we hash on the device id or something (see the sketch below).
At least in the situations and use cases I know, (strong) consistency is not so important.

From a CAP perspective, for me, availability is the only non-negotiable thing... for the others... we can discuss : )
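
A rough sketch of the token-ring idea mentioned above (illustrative only, not Cassandra's or IoTDB's actual code): device ids are hashed onto a ring of tokens, and each device is owned by the node holding the first token at or after its hash, wrapping around.

    // Hypothetical token ring: tokens map to nodes; a device id is routed to the
    // node owning the first token at or after hash(deviceId), wrapping around.
    import java.util.Map;
    import java.util.TreeMap;

    class TokenRingSketch {
        private final TreeMap<Integer, String> ring = new TreeMap<>();  // token -> node

        void addNode(String node, int token) {
            ring.put(token, node);
        }

        String ownerOf(String deviceId) {
            if (ring.isEmpty()) {
                throw new IllegalStateException("no nodes on the ring");
            }
            Map.Entry<Integer, String> entry = ring.ceilingEntry(deviceId.hashCode());
            return (entry != null ? entry : ring.firstEntry()).getValue();
        }
    }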

Julian

PS: Perhaps it would be beneficial to create a design doc in Confluence?

Re: IoTDB supports distributed version

Posted by Xiangdong Huang <sa...@gmail.com>.
Yep, I think the cluster is in P2P mode when the nodes start up. Then a leader
election algorithm will change the cluster into M/S mode (the Raft algorithm
qualifies). If the master is down, a new master can be elected to lead the
cluster.

By the way, we need to consider the cost of keeping data strongly consistent.
Time series data in IoT scenarios rarely conflicts (I mean, user1 sends a data
point (t1, v1) for device 1 and sensor 1 while, at the same time, user2 sends a
data point (t2, v2) for the same device and sensor with t2 = t1). So,
supporting multiple consistency levels is better for keeping write performance
high.
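
To make the "rarely conflicts" observation concrete, here is a minimal, hypothetical illustration (not IoTDB code): two writes conflict only when they target the same device, the same sensor, and the same timestamp, so replicas do not need to agree on a global insert order.

    // Hypothetical key for a written data point; only identical keys can conflict.
    import java.util.Objects;

    final class DataPointKeySketch {
        final String device;     // e.g. "root.vehicle.d1"
        final String sensor;     // e.g. "s1"
        final long timestamp;    // e.g. t1, in milliseconds

        DataPointKeySketch(String device, String sensor, long timestamp) {
            this.device = device;
            this.sensor = sensor;
            this.timestamp = timestamp;
        }

        boolean conflictsWith(DataPointKeySketch other) {
            return device.equals(other.device)
                    && sensor.equals(other.sensor)
                    && timestamp == other.timestamp;
        }

        @Override public boolean equals(Object o) {
            return o instanceof DataPointKeySketch && conflictsWith((DataPointKeySketch) o);
        }

        @Override public int hashCode() {
            return Objects.hash(device, sensor, timestamp);
        }
    }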

Best,

-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


Re: IoTDB supports distributed version

Posted by Julian Feinauer <j....@pragmaticminds.de>.
Hi XuYi,

I like the idea but I'm unsure if I like the master / slave approach.
We often deal with "shopfloor" scenarios where the setup for the database is basically "multi-master": on the one hand we need to sync data, but if a system goes down, everything else should keep working.
Would this be possible with your approach?
Something like leader re-election with ZooKeeper (or better, Curator?), as sketched below.
What exactly are the use cases you have in mind?
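
For what such a leader re-election could look like, here is a rough sketch using Apache Curator's LeaderSelector recipe; the ZooKeeper connection string and the path are illustrative assumptions, and this is not proposing it as IoTDB's actual mechanism.

    // Sketch: each node competes for leadership; when the current leader's
    // ZooKeeper session is lost, another node's takeLeadership() is invoked.
    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.leader.LeaderSelector;
    import org.apache.curator.framework.recipes.leader.LeaderSelectorListenerAdapter;
    import org.apache.curator.retry.ExponentialBackoffRetry;

    class LeaderElectionSketch {
        public static void main(String[] args) {
            CuratorFramework client = CuratorFrameworkFactory.newClient(
                    "zk1:2181,zk2:2181,zk3:2181", new ExponentialBackoffRetry(1000, 3));
            client.start();

            LeaderSelector selector = new LeaderSelector(client, "/iotdb/leader",
                    new LeaderSelectorListenerAdapter() {
                        @Override
                        public void takeLeadership(CuratorFramework c) throws Exception {
                            // This node acts as master until it returns from this
                            // method or loses its ZooKeeper session.
                            System.out.println("this node is now the master");
                            Thread.sleep(Long.MAX_VALUE);
                        }
                    });
            selector.autoRequeue();   // rejoin the election after losing leadership
            selector.start();
        }
    }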

Thanks!
Julian
