
Posted to dev@iotdb.apache.org by "songbinghua@iie.ac.cn" <so...@iie.ac.cn> on 2021/11/10 13:22:07 UTC

Reply: RE: What is the next architecture of IoTDB?

These are the features I am concerned about for the next architecture, as proposed in my first discussion email:

1. the number of nodes a cluster can support,
2. the write and read performance with multiple replicas,
3. the linear scalability of reads and writes in a cluster,
4. the efficiency and intelligence of the cluster scheduler under dynamic workloads,
5. joint queries across time series data and relational data,
6. federated cluster queries across data centers, AZs, and regions,
7. spatio-temporal data analysis in one DB, etc.


宋秉华
songbinghua@iie.ac.cn
 
From: jianyun cheng
Sent: 2021-11-10 21:06
To: dev@iotdb.apache.org
Subject: RE: What is the next architecture of IoTDB?
Totally agree with the point that we should have some new architecture.
 
But before we get started, we should confirm that the challenges really cannot be overcome in the current architecture and that IoTDB really needs the features in the challenge list. Each architecture has its strengths and weaknesses. What I worry about is that we spend a lot of time on the new architecture but in the end get very little benefit.

I think we can follow the steps below to decide whether we need a new architecture.
 
 
  1.  List the features we expect to support.
  2.  Triage the features and filter out the true requirements. (This can be done via a vote.)
  3.  Can these true requirements be implemented in the current architecture?
  4.  If the answer in step 3 is no, then we really need a new architecture.

Architecture evolution is an eternal topic; we should know where we want to go at each step. It would be great if you could share some ideas so that we can have a discussion.

What do you think?
 
----------------------------------------------------------
Jianyun Cheng
Thanks!
 
From: songbinghua@iie.ac.cn<ma...@iie.ac.cn>
Sent: Wednesday, November 10, 2021 10:48 AM
To: dev<ma...@iotdb.apache.org>
Subject: What is the next architecture of IoTDB?
 
What is the next architecture of IoTDB?
 
As we know, the IoTDB cluster version is based on the standalone architecture: it reuses almost all the basic components of the standalone version and just adds a multi-Raft protocol based on Thrift.
 
As adoption spreads, the cluster version of IoTDB is being challenged in many aspects:
the number of nodes a cluster can support,
the write and read performance with multiple replicas,
the linear scalability of reads and writes in a cluster,
the efficiency and intelligence of the cluster scheduler under dynamic workloads,
joint queries across time series data and relational data,
federated cluster queries across data centers, AZs, and regions, and spatio-temporal data analysis in one DB, etc.
 
So I suggest we make a blueprint for the next architecture of IoTDB, and we can learn from the strengths of InfluxDB.
InfluxDB has designed its next architecture, named IOx, which uses DataFusion as the next query engine and adopts Parquet as the file format, etc.

It's time to act. WDYT?
 
 
 
Bruce Song 宋秉华
icloudsong

Re: What is the next architecture of IoTDB?

Posted by Dawei Liu <at...@163.com>.

Hi,


We should also think about how the edge side and terminal devices coordinate with the data center.


Dawei Liu

Re: What is the next architecture of IoTDB?

Posted by Eric Pai <er...@hotmail.com>.

I may be able to answer your question about the intention of the IT framework, as we have already benefited from it 😊.

The biggest advantage of the IT framework is that it lets us REUSE our existing IT cases in different cluster modes. Although the E2E test based on test-container technology can test in cluster mode, we would have to migrate or copy the IT cases from standalone to the E2E test. That is hard to maintain and requires a lot of duplicated work. Our goal is to let a contributor develop an IT case once and have it run in standalone mode, cluster mode, and so on. What's more, running the E2E test needs a Docker runtime, which is a little heavier than running directly. Now in order to decrease the time cost, all test
But I should admit that the IT framework has some disadvantages as well. It's good at testing normal client-server communication, but it can't test some node failure/restart scenarios.

As for the test boundaries between the IT framework and E2E testing, we can discuss them later.


Re: What is the next architecture of IoTDB?

Posted by 谭新宇 <10...@qq.com.INVALID>.

There's been a lot of talk about new cluster architectures, and stability has been a problem for a long time.

For the former, I agree that we need to summarize our experience and discuss the problems with our current architecture and possible improvements. But it's not enough just to point out the problem. We need more detailed research and test results. In fact, it seems to me that the current problems are more a result of code implementation than cluster architecture.

For the latter, I think it has less to do with cluster architecture and more to do with inadequate testing. Stabilizing an efficient system is very difficult and takes a lot of time. We've fixed a lot of bugs so far based on the existing architecture. On the negative side, I think these bugs may still appear with the new architecture, and we need to carefully evaluate the workload. I totally agree with the test-driven approach, but regarding the IT framework mentioned earlier, I wonder how it differs from and relates to the current E2E framework. In addition to the cluster bug summary, I made a list on Jira (https://issues.apache.org/jira/issues/?filter=12351069) that you can look at.

At present, there are more than 10 people working on the cluster module in the community. I think there is still room for improvement in the way we have collaborated so far. If everyone has a clear goal and a clear division of labor, I believe we will do a good job.

But I totally agree with Jianyun.

> Before we get started, we should confirm that the challenges really cannot be overcome in the current architecture and that IoTDB really needs the features in the challenge list. Each architecture has its strengths and weaknesses. What I worry about is that we spend a lot of time on the new architecture but in the end get very little benefit.
> I think we can follow the steps below to decide whether we need a new architecture.

> 1.  List the features we expect to support.
> 2.  Triage the features and filter out the true requirements. (This can be done via a vote.)
> 3.  Can these true requirements be implemented in the current architecture?
> 4.  If the answer in step 3 is no, then we really need a new architecture.

So far in the mailing list we've only talked about 1 and 2, not 3 and 4. Let's convince everyone in the community by presenting the next generation of architecture and explaining the advantages of the new architecture over the old one on these issues.


—————————
Xinyu Tan
