Posted to dev@iotdb.apache.org by 程建云 <ch...@360.cn> on 2021/11/24 09:38:22 UTC

About making an iterable cluster version in enterprise

Hi, all

After running the cluster version against an online stream for about two weeks, we experienced two failures in which the cluster stopped responding and could not be recovered by restarting. We also did not find an effective way to recover data from the cluster. So we would like to build a testable cluster version for enterprise use, which should have the following properties:

1. Write operations are not blocked frequently.
2. Query bugs are tolerable, as long as they can be fixed and iterated on quickly.
3. Most issues can be resolved by restarting nodes or the cluster.
4. There is a way to resolve otherwise unrecoverable issues at the cost of losing a small part of the data.
5. A cluster restart completes in a reasonable time.
6. The system has a monitoring mechanism.

We plan to improve the following aspects:

1. Metadata uses too much memory

In our scenario the number of measurements is large, around 1 billion, but the ingestion rate is low (100K data points per second). We found that the cluster nodes cannot hold the metadata in memory (each node has 256 GB of memory). Because the request rate is low, the CPU load is only about 1% to 2%. For this scenario, we intend to introduce a third-party storage component such as RocksDB to help manage schema metadata. Of course, this would be optional and configurable.
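
To make the idea of an optional, configurable schema backend more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `SchemaStore` interface, the `schema_store` and `schema_store_path` config keys, and the use of the standard-library `sqlite3` module as a stand-in for an on-disk key-value store such as RocksDB. It is not IoTDB code and does not reflect how IoTDB lays out its metadata.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class SchemaStore(ABC):
    """Hypothetical interface: maps a measurement path to its schema text."""

    @abstractmethod
    def put(self, path: str, schema: str) -> None:
        ...

    @abstractmethod
    def get(self, path: str) -> Optional[str]:
        ...


class InMemorySchemaStore(SchemaStore):
    """Keeps every schema entry on the heap (the default behaviour)."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, path: str, schema: str) -> None:
        self._data[path] = schema

    def get(self, path: str) -> Optional[str]:
        return self._data.get(path)


class DiskSchemaStore(SchemaStore):
    """Stand-in for a RocksDB-backed store; sqlite3 is used only because
    it ships with Python. Entries are read from the database on demand."""

    def __init__(self, db_path: str) -> None:
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS schema ("
            "path TEXT PRIMARY KEY, def TEXT NOT NULL)"
        )

    def put(self, path: str, schema: str) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO schema (path, def) VALUES (?, ?)",
                (path, schema),
            )

    def get(self, path: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT def FROM schema WHERE path = ?", (path,)
        ).fetchone()
        return row[0] if row else None


def create_schema_store(config: dict) -> SchemaStore:
    """Pick the backend from configuration; both keys are hypothetical."""
    backend = config.get("schema_store", "memory")
    if backend == "memory":
        return InMemorySchemaStore()
    if backend == "disk":
        return DiskSchemaStore(config["schema_store_path"])
    raise ValueError(f"unknown schema_store backend: {backend!r}")
```

Callers only see `SchemaStore`, so a deployment with few measurements can keep the in-memory default, while one with a very large measurement count can set `schema_store` to `disk` and point `schema_store_path` at a file.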

2. Raft implementation

For this one, we plan to proceed in two steps. First, we would like to abstract the Raft interfaces and make Raft an independent component. This should also be a work item when implementing the new architecture. Second, we would like to introduce a third-party Raft library such as Ratis and, ideally, make the choice configurable.
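
The first step, putting Raft behind an interface so that the implementation can be chosen by configuration, might look roughly like the Python sketch below. All names (`ConsensusLayer`, `SingleNodeConsensus`, the `consensus_impl` key) are hypothetical. The single-node class is only a placeholder that appends to a local log and applies entries immediately; it performs no leader election or replication and is not Raft. A built-in Raft implementation or an adapter around a library such as Ratis would be registered next to it.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Type

# Callback that applies a committed entry to the state machine.
ApplyFn = Callable[[bytes], None]


class ConsensusLayer(ABC):
    """Hypothetical seam between the storage engine and the consensus code."""

    @abstractmethod
    def propose(self, entry: bytes) -> int:
        """Submit an entry; return its 1-based log index once committed."""


class SingleNodeConsensus(ConsensusLayer):
    """Placeholder only: commits locally, with no election or replication."""

    def __init__(self, apply_fn: ApplyFn) -> None:
        self._log: List[bytes] = []
        self._apply = apply_fn

    def propose(self, entry: bytes) -> int:
        self._log.append(entry)
        self._apply(entry)  # a single node commits immediately
        return len(self._log)


# Implementations selectable by configuration. A Raft implementation or a
# third-party adapter would add its own entry here.
_REGISTRY: Dict[str, Type[ConsensusLayer]] = {
    "single": SingleNodeConsensus,
}


def create_consensus(config: dict, apply_fn: ApplyFn) -> ConsensusLayer:
    """Build the implementation named by the (hypothetical) config key."""
    name = config.get("consensus_impl", "single")
    if name not in _REGISTRY:
        raise ValueError(f"unknown consensus_impl: {name!r}")
    return _REGISTRY[name](apply_fn)
```

Because the rest of the system depends only on `ConsensusLayer.propose`, swapping the implementation becomes a configuration change rather than a code change.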

3. Engineering components

The cluster is missing some components: a monitoring system (the community should already be working on this one, and we would like to help if needed), a tool to migrate single-node data into a cluster, and tools to help with failure recovery. We need to build these tools to make the system observable and recoverable.

4. Test

As a new test architecture is being introduced into the community, we will try to add test cases under the new architecture.

Most of the solutions above have not been investigated deeply; any ideas are welcome.

What is the benefit of this work?
We intend to run this version in production so that we can collect feedback and bugs from real users and iterate on them. Eventually it should become the baseline for a stable cluster version.

Why not do this in the new architecture?
We are not doing this under the new architecture because its planning has only just started and we cannot wait any longer. Nearly all of this work is compatible with the new architecture and could be reused in it.

Please feel free to reply to this email if you have any concerns or ideas.

----------------------------------------------------------
Thanks!
Jianyun Cheng

Re: About making an iterable cluster version in enterprise

Posted by CloudWise-Luke <28...@qq.com.INVALID>.

+1, maintain it in the rel/0.12 branch

CloudWiseluke.miao

------------------ Original Message ------------------
From: "dev" <ccgowork@163.com>;
Sent: Thursday, November 25, 2021, 9:12 AM
To: "dev@iotdb.apache.org" <dev@iotdb.apache.org>;

Subject: Re: About making an iterable cluster version in enterprise

+1, and I suggest maintaining it in the rel/0.12 branch, since it is stable now.

Thanks!

Chao Wang
BONC ltd
ccgowork@163.com

Re: About making an iterable cluster version in enterprise

Posted by Chao Wang <cc...@163.com>.

+1, and I suggest maintaining it in the rel/0.12 branch, since it is stable now.

Thanks!

Chao Wang
BONC ltd
ccgowork@163.com
On 11/25/2021 09:03, Houliang Qi <ne...@163.com> wrote:
Hi,
We have also run into similar situations in our test environment. We strongly support creating a stable, usable version based on the current one.

Thanks,
---------------------------------------
Houliang Qi
BONC, Ltd

Re: About making an iterable cluster version in enterprise

Posted by Houliang Qi <ne...@163.com>.

Hi,
We have also run into similar situations in our test environment. We strongly support creating a stable, usable version based on the current one.

Thanks,
---------------------------------------
Houliang Qi
BONC, Ltd

On 11/24/2021 21:29, Xiangdong Huang <sa...@gmail.com> wrote:
Hi,
I think it is fine to keep maintaining the current cluster module,
on either the rel/0.12 or the master branch.
Looking forward to the progress.

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

黄向东
清华大学 软件学院

Re: About making an iterable cluster version in enterprise

Posted by Xiangdong Huang <sa...@gmail.com>.

Hi,
I think it is fine to keep maintaining the current cluster module,
on either the rel/0.12 or the master branch.
Looking forward to the progress.

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院
