Posted to dev@iotdb.apache.org by "songbinghua@iie.ac.cn" <so...@iie.ac.cn> on 2021/11/10 02:47:51 UTC

What is the next architecture of IoTDB?

What is the next architecture of IoTDB?

As we know, the IoTDB cluster version is built on the standalone architecture: it reuses almost all the basic components of the standalone version and just adds a multi-Raft protocol on top of Thrift.

As adoption spreads, the cluster version of IoTDB is being challenged on many fronts:
the number of nodes a cluster can support, the write and read performance with multiple replicas,
the linear scaling ratio of read and write performance in the cluster,
the efficiency and intelligence of the cluster scheduler under dynamic workloads,
joint queries across time series data and relational data,
federated cluster queries across data centers, AZs and regions, and spatio-temporal data analysis in one DB, etc.

So I suggest we draw up a blueprint for the next architecture of IoTDB, and we can learn from the strengths of InfluxDB.
InfluxDB has designed its next architecture, named IOx, which uses DataFusion as the next query engine, adopts Parquet as the file format, and so on.

It's time to act. WDYT?



Bruce Song 宋秉华
icloudsong

Re: What is the next architecture of IoTDB?

Posted by HW-Chao Wang <57...@qq.com.INVALID>.

Good, we should design a new architecture in the next version.


Re: What is the next architecture of IoTDB?

Posted by Dawei Liu <at...@163.com>.

Hi,


We should also think about how the data center coordinates with the edge and with terminal devices.


Dawei Liu

Re: What is the next architecture of IoTDB?

Posted by Eric Pai <er...@hotmail.com>.

I may be able to answer your question about the intention of the IT framework, as we have already benefited from it 😊.

The biggest advantage of the IT framework is that it lets us REUSE our existing IT cases in different cluster modes. Although the E2E tests based on the test-container technology can test in cluster mode, we would have to migrate or copy the IT cases from standalone into the E2E tests. That is hard to maintain and requires a lot of duplicated work. Our goal is to let a contributor develop an IT case once and have it run in standalone mode, cluster mode, and so on. What's more, running the E2E tests needs a Docker runtime, which is a little heavier than running directly. Now in order to decrease the time cost, all test
But I should admit that the IT framework has some disadvantages as well. It's good at testing normal client-server communication, but it can't test node failure or restart scenarios.
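
To make the "develop an IT case once, run it in several modes" goal a bit more concrete, here is a minimal sketch of the idea. It is not the actual IoTDB IT framework: `Env`, `InMemoryEnv`, the toy `INSERT` / `SELECT_DESC` statements and `run_in_all_modes` are all hypothetical names made up for illustration. The only point is that the test case talks to an interface, so the same case can be replayed against each deployment mode.

```python
from abc import ABC, abstractmethod


class Env(ABC):
    """Hypothetical abstraction over a deployment mode (not an IoTDB API)."""

    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def execute(self, statement: str) -> list:
        """Run a statement against this deployment and return result rows."""


class InMemoryEnv(Env):
    """Toy stand-in for a real deployment; stores points in a dict."""

    def __init__(self, name: str):
        super().__init__(name)
        self._points = {}  # timestamp -> value

    def execute(self, statement: str) -> list:
        # A deliberately tiny made-up "protocol" so the sketch is self-contained.
        parts = statement.split()
        if parts[0] == "INSERT":
            self._points[int(parts[1])] = float(parts[2])
            return []
        if parts[0] == "SELECT_DESC":
            return sorted(self._points.items(), reverse=True)
        raise ValueError(f"unsupported statement: {statement}")


def it_case_order_by_desc(env: Env) -> None:
    """One IT case, written once, that only talks to the Env interface."""
    for ts, value in [(1, 10.0), (3, 30.0), (2, 20.0)]:
        env.execute(f"INSERT {ts} {value}")
    rows = env.execute("SELECT_DESC")
    assert [ts for ts, _ in rows] == [3, 2, 1]


def run_in_all_modes(case, envs) -> dict:
    """Run the same case against every environment and collect pass/fail."""
    results = {}
    for env in envs:
        try:
            case(env)
            results[env.name] = "pass"
        except AssertionError:
            results[env.name] = "fail"
    return results


results = run_in_all_modes(
    it_case_order_by_desc,
    [InMemoryEnv("standalone"), InMemoryEnv("cluster")],
)
```

In a real setup the two environments would wrap a standalone server and a cluster respectively. As noted above, this style covers normal client-server behaviour but not node failure or restart scenarios.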

As for the test boundaries between the IT framework and E2E testing, we can discuss them later.

在 2021/11/11 上午10:02，“谭新宇”<10...@qq.com.INVALID> 写入:

    There's been a lot of talk about cluster new architectures, and stability has been a problem for a long time.

    For the former, I agree that we need to summarize our experience and discuss the problems with our current architecture and possible improvements. But it's not enough just to point out the problem. We need more detailed research and test results. In fact, it seems to me that the current problems are more a result of code implementation than cluster architecture.

    For the latter, I think it has less to do with cluster architecture and more to do with inadequate testing. It is very difficult to stabilize an efficient system and requires a lot of time cost. We've fixed a lot of bugs so far based on the existing architecture. On the negative side, I think these bugs may still appear with the new architecture, and we need to carefully evaluate the workload. I totally agree with the test-driven approach, but with the IT framework mentioned earlier, I wonder how IT is different and related to the current E2E framework? In addition to the cluster bug summary, I made a list on jira (https://apac01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.apache.org%2Fjira%2Fissues%2F%3Ffilter%3D12351069&amp;data=04%7C01%7C%7Cb372d00f1b9744913e2208d9a4b75077%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637721929505858330%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=faDKNED7xRXP21%2Fs62Zm%2Fg4cS3Um6RRWVoyQvAW216s%3D&amp;reserved=0), you can look at it.

    At present, there are more than 10 people working in the cluster module in the community. I think there is still room for improvement in the previous way of collaboration. If everyone has a clear goal and a clear division of labor, I believe we will do a good job.

    But I totally agree with Jianyun.

    > Before get started, We should know that the challenges really can't be overcome in current architecture and IoTDB really needs the features in the challenge list. Each architecture has its pro and weakness. Actually what I worry    > about is when we spend a lot time on the new architecture but finally we just get very a few benefit.`
    > I think we can follow the below steps to decide if we need a new architecture or not.

    > 1.  List the features expect to support.
    > 2.  Triage the features, filter out true requirements. (This can be done via vote.)
    > 3.  Can these true requirements be implemented in the current architecture?
    > 4. If the answer in the 3 steps is No. Then we really need a new architecture.

    So far in the mailing list we've only talked about 1 and 2, not 3 and 4. Let's convince everyone in the community by presenting the next generation of architecture and explaining the advantages of the new architecture over the old one on these issues.


    —————————
    Xinyu Tan

    > 在 2021年11月10日，下午9:22，songbinghua@iie.ac.cn 写道：
    > 
    > This is the features I concerned for the next architecture that I have proposed in my first discussion email:
    > 
    > 1. the num of nodes in cluster can support, 
    > 2. the write and read performance in multi-replicas,
    > 3. the linear ratio of read and write in cluster,
    > 4. the efficiency and smartness of scheduler in cluster based on the dynamic workloads,
    > 5. the joint query between time series data and relational data,
    > 6. the federal cluster query across data centers, AZ and regions,
    > 7. the spatio-temporal data analysis in one DB etc…
    > 
    > 
    > 宋秉华
    > songbinghua@iie.ac.cn
    > 
    > 发件人： jianyun cheng
    > 发送时间： 2021-11-10 21:06
    > 收件人： dev@iotdb.apache.org
    > 主题： RE: What is the next architecture of IoTDB?
    > Totally agree with the point that we should have some new architecture.
    > 
    > But before get started, we should know that the challenges really can’t be overcome in current architecture and IoTDB really needs the features in the challenge list. Each architecture has its pro and weakness. Actually what I worry about is when we spend a lot time on the new architecture but finally we just get very a few benefit.
    > 
    > I think we can follow the below steps to decide if we need a new architecture or not.
    > 
    > 
    >  1.  List the features expect to support.
    >  2.  Triage the features, filter out true requirements. (This can be done via vote.)
    >  3.  Can these true requirements be implemented in the current architecture?
    >  4.  If the answer in the 3 step is No. Then we really need a new architecture.
    > 
    > Architecture evolution is an eternal topic, we should know where we want to go in each step. It would be very great if you can share some ideas so that we could have some discussion.
    > 
    > How do you think?
    > 
    > ----------------------------------------------------------
    > Jianyun Cheng
    > Thanks!
    > 
    > From: songbinghua@iie.ac.cn<ma...@iie.ac.cn>
    > Sent: Wednesday, November 10, 2021 10:48 AM
    > To: dev<ma...@iotdb.apache.org>
    > Subject: What is the next architecture of IoTDB?
    > 
    > What is the next architecture of IoTDB?
    > 
    > As we know，IoTDB cluster version is based on the standalone architecture and almost reused all the basic components of standalone version and just add a multi-raft protocol based on the thrift.
    > 
    > With the application spreading, the performance of the cluster version of IoTDB is changllaged for many aspects：
    > the num of nodes in cluster can support, the write and read performance in multi-replicas,
    > the linear ratio of read and write in cluster,
    > the efficiency and smartness of scheduler in cluster based on the dynamic workloads,
    > the joint query between time series data and relational data,
    > the federal cluster query across data centers, AZ and regions and the spatio-temporal data analysis in one DB etc…
    > 
    > So, I suggest we should make a blueprint for the next architecture of IoTDB and we can learn the advantage of the InfluxDB.
    > The InfluxDB has designed the next architecture named IOx and use Fusion as the next query engine and adopt parquet as the file format etc.
    > 
    > It's time to action, WDYT？
    > 
    > 
    > 
    > Bruce Song 宋秉华
    > icloudsong
    >

Re: What is the next architecture of IoTDB?

Posted by 谭新宇 <10...@qq.com.INVALID>.

There's been a lot of talk about new cluster architectures, and stability has been a problem for a long time.

For the former, I agree that we need to summarize our experience and discuss the problems with our current architecture and possible improvements. But it's not enough just to point out the problems; we need more detailed research and test results. In fact, it seems to me that the current problems are more a result of the code implementation than of the cluster architecture.

For the latter, I think it has less to do with cluster architecture and more to do with inadequate testing. Stabilizing an efficient system is very difficult and takes a lot of time. We've fixed a lot of bugs so far on the existing architecture. On the negative side, I think these bugs may still appear with a new architecture, and we need to carefully evaluate the workload. I totally agree with the test-driven approach, but regarding the IT framework mentioned earlier, I wonder how it differs from and relates to the current E2E framework? In addition, as a summary of the cluster bugs, I made a list on Jira (https://issues.apache.org/jira/issues/?filter=12351069); you can take a look.

At present, there are more than 10 people working on the cluster module in the community. I think there is still room for improvement in the way we have collaborated so far. If everyone has a clear goal and there is a clear division of labor, I believe we will do a good job.

But I totally agree with Jianyun.

> Before we get started, we should know that the challenges really can't be overcome in the current architecture and that IoTDB really needs the features in the challenge list. Each architecture has its pros and cons. Actually, what I worry about is that we spend a lot of time on the new architecture but in the end get very little benefit.
> I think we can follow the steps below to decide whether we need a new architecture or not.

> 1.  List the features we expect to support.
> 2.  Triage the features and identify the true requirements. (This can be done via a vote.)
> 3.  Can these true requirements be implemented in the current architecture?
> 4.  If the answer to step 3 is no, then we really need a new architecture.

So far on the mailing list we've only talked about 1 and 2, not 3 and 4. Let's convince everyone in the community by presenting the next-generation architecture and explaining its advantages over the old one on these issues.


—————————
Xinyu Tan


Re: RE: What is the next architecture of IoTDB?

Posted by "songbinghua@iie.ac.cn" <so...@iie.ac.cn>.

These are the features I care about for the next architecture, as proposed in my first discussion email:

1. the number of nodes a cluster can support,
2. the write and read performance with multiple replicas,
3. the linear scaling ratio of read and write performance in the cluster,
4. the efficiency and intelligence of the cluster scheduler under dynamic workloads,
5. joint queries across time series data and relational data,
6. federated cluster queries across data centers, AZs and regions,
7. spatio-temporal data analysis in one DB, etc.
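
As a side note on item 3, one possible way to put a number on the "linear ratio" is to compare measured cluster throughput with the ideal of N times the single-node throughput. The sketch below assumes that reading of the item; the function name `scaling_efficiency` and all figures are hypothetical and are not measured IoTDB results.

```python
def scaling_efficiency(single_node_throughput: float,
                       cluster_throughput: float,
                       num_nodes: int) -> float:
    """Ratio of measured cluster throughput to ideal linear throughput.

    1.0 means perfectly linear scaling; lower values mean diminishing returns.
    """
    if num_nodes < 1 or single_node_throughput <= 0:
        raise ValueError("need at least one node and a positive baseline")
    return cluster_throughput / (single_node_throughput * num_nodes)


# Hypothetical numbers for illustration only, not measured IoTDB results.
baseline = 100_000.0                      # points/s written on one node
measured = {3: 240_000.0, 5: 350_000.0}   # node count -> points/s

efficiency = {n: scaling_efficiency(baseline, t, n) for n, t in measured.items()}
```

With these made-up inputs the 3-node cluster reaches about 80% of ideal linear throughput and the 5-node cluster about 70%. The same calculation can be run separately for reads and for writes.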


宋秉华
songbinghua@iie.ac.cn
 

RE: What is the next architecture of IoTDB?

Posted by jianyun cheng <ch...@outlook.com>.

Totally agree with the point that we should have some new architecture.

But before we get started, we should be sure that the challenges really can't be overcome in the current architecture and that IoTDB really needs the features in the challenge list. Each architecture has its pros and cons. What I actually worry about is that we spend a lot of time on the new architecture but in the end get very little benefit.

I think we can follow the steps below to decide whether we need a new architecture or not.


  1.  List the features we expect to support.
  2.  Triage the features and identify the true requirements. (This can be done via a vote.)
  3.  Can these true requirements be implemented in the current architecture?
  4.  If the answer to step 3 is no, then we really need a new architecture.
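
The four steps above can be written down as a small decision procedure. This is only an illustrative sketch: the `Feature` record, the simple-majority rule for the vote, and the example entries are all assumptions, and whether a feature is feasible in the current architecture is something the community would have to judge, not something code can compute.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    votes_for: int
    votes_against: int
    feasible_in_current_arch: bool  # outcome of step 3, judged by the community


def true_requirements(features):
    """Step 2: keep the features that a simple majority voted for (assumed rule)."""
    return [f for f in features if f.votes_for > f.votes_against]


def needs_new_architecture(features) -> bool:
    """Steps 3 and 4: a new architecture is needed only if some true
    requirement cannot be implemented in the current architecture."""
    return any(not f.feasible_in_current_arch for f in true_requirements(features))


# Step 1: list candidate features. All values below are made up for illustration.
candidates = [
    Feature("more nodes per cluster", 5, 1, True),
    Feature("cross-region federated query", 4, 2, False),
    Feature("spatio-temporal analysis", 1, 3, False),
]

# True requirements that the current architecture cannot satisfy.
blocking = [f.name for f in true_requirements(candidates)
            if not f.feasible_in_current_arch]
decision = needs_new_architecture(candidates)
```

In this made-up example only one voted-in feature is judged infeasible in the current architecture, and that alone is enough to answer step 4 with "yes".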

Architecture evolution is an eternal topic; we should know where we want to go at each step. It would be great if you could share some ideas so that we can have a discussion.

What do you think?

----------------------------------------------------------
Jianyun Cheng
Thanks!


Re: What is the next architecture of IoTDB?

Posted by Dawei Liu <at...@163.com>.

Hi,


+1,

This allows us to align our goals,
know where we need to go next while diving into technical details,
and prepare dynamically at each step.




Dawei Liu

Re: What is the next architecture of IoTDB?

Posted by CloudWise-Luke <28...@qq.com.INVALID>.

This is a good suggestion. I think we can let a few people explore the next architecture while ensuring high-quality development and optimization progress on the existing IoTDB.



CloudWiseluke.miao



Re: Re: What is the next architecture of IoTDB?

Posted by "songbinghua@iie.ac.cn" <so...@iie.ac.cn>.

I agree that the quality of the cluster version of IoTDB is important and should have high priority, but I think these two things should proceed in parallel.
Without a good architecture, we have no competitive edge to meet the challenge from competitors, and it is also impossible to achieve good quality on top of an inherently inadequate architecture.
Many well-known open source projects balance strategic and operational objectives to stay competitive; InfluxDB, for example, has already gone into action. So it is urgent to prepare in advance.
I plan to create a new branch for a POC of the next architecture in parallel. You are welcome to join us and make IoTDB great!




宋秉华
songbinghua@iie.ac.cn
 
From: Eric Pai
Date: 2021-11-10 11:14
To: dev@iotdb.apache.org
Subject: Re: What is the next architecture of IoTDB?
Hi,
 
It's nice to plan the next-generation architecture of IoTDB.
 
However, before we start to design and discuss refinements and optimizations, we should ensure that the cluster quality is what we expect. In our internal tests, by replaying the existing IT cases in a cluster environment, we have found many basic functions that work abnormally, e.g. ORDER BY DESC queries, aggregation queries, some filters with wrong serialize/deserialize logic, NPEs, and so on. These bugs are easy to reproduce, but no one has fully tested them, and there is no framework to ensure that every new or existing feature works well in both cluster and standalone environments. Manual testing by contributors is heavy work with little scenario coverage.
 
Designing a new architecture is necessary, but I think quality assurance may have higher priority than anything else. Good quality assurance gets IoTDB to 60/100, and then we can take it to 100/100 by applying optimizations.
 

Re: Re: What is the next architecture of IoTDB?

Posted by "songbinghua@iie.ac.cn" <so...@iie.ac.cn>.

I agree that the quality of the cluster version of IoTDB is important and should have high priority, but I think these two things should proceed in parallel.
Without a good architecture, we have no competitive edge to meet the challenge from competitors, and it is also impossible to achieve good quality on top of an inherently inadequate architecture.

Many well-known open source projects balance strategic and operational objectives to stay competitive; InfluxDB, for example, has already gone into action. So it is urgent to prepare in advance.

I plan to create a new branch for a POC of the next architecture in parallel. You are welcome to join us and make IoTDB great!




宋秉华
songbinghua@iie.ac.cn
From: Eric Pai
Date: 2021-11-10 11:14
To: dev@iotdb.apache.org
Subject: Re: What is the next architecture of IoTDB?
Hi,
It's nice to plan the next-generation architecture of IoTDB.
However, before we start to design and discuss refinements and optimizations, we should make sure the cluster quality meets our expectations. In our internal tests, by replaying the existing IT cases in a cluster environment, we found that many basic functions work abnormally, e.g. ORDER BY DESC queries, aggregation queries, some filters with wrong serialize/deserialize logic, NPEs, and so on. These bugs are easy to reproduce, but no one has fully tested for them, and there is no framework to ensure that every new or existing feature works well in both cluster and standalone environments. Manual testing by contributors is heavy work with low scenario coverage.
Designing a new architecture is necessary, but I think quality assurance should have higher priority than anything else. Good quality assurance gets IoTDB to 60/100, and then we can take it to 100/100 by applying optimizations.

Re: Re: What is the next architecture of IoTDB?

Posted by "songbinghua@iie.ac.cn" <so...@iie.ac.cn>.

Could you gather the bugs, errors, and defects of IoTDB from everyone and compile them into a prioritized list, so that we can
deal with them in order of priority?

宋秉华
songbinghua@iie.ac.cn

From: Houliang Qi
Date: 2021-11-10 15:17
To: dev@iotdb.apache.org
Subject: Re: What is the next architecture of IoTDB?
Hi all,
This is really a good suggestion. I also think it is necessary to rethink the design of the distributed architecture.

The other important point is that starting from scratch may make the project relatively large. Since there is strong demand for the distributed version, and the community has already released a 0.12 distributed version for testing (suggested for testing only, not for production environments), I think we can first improve the existing distributed version until it can be used in production.

At the same time, when we work on integration tests, we should keep the integration-test design general, so that the test cases designed now can be reused when we start to develop the next-generation distributed architecture. So what we are doing now is meaningful.

Thanks,
---------------------------------------
Houliang Qi
BONC, Ltd


Re: What is the next architecture of IoTDB?

Posted by Houliang Qi <ne...@163.com>.

Hi all,
This is really a good suggestion. I also think it is necessary to rethink the design of the distributed architecture.

The other important point is that starting from scratch may make the project relatively large. Since there is strong demand for the distributed version, and the community has already released a 0.12 distributed version for testing (suggested for testing only, not for production environments), I think we can first improve the existing distributed version until it can be used in production.

At the same time, when we work on integration tests, we should keep the integration-test design general, so that the test cases designed now can be reused when we start to develop the next-generation distributed architecture. So what we are doing now is meaningful.

Thanks,
---------------------------------------
Houliang Qi
BONC, Ltd

On 11/10/2021 12:35, Junqing Wang <ir...@gmail.com> wrote:
Agree with Eric Pai.

The quality of the cluster version is very important. For now, we should
focus more on quality than on a new architecture. Without quality,
the cluster version is hard to use in a production environment.

Later I will contribute a PR with an improved IT testing framework that
reuses the existing IT cases by replaying them in a cluster environment.

I also created a #integration-test channel in Slack; everyone is welcome to
join the discussion.


Re: What is the next architecture of IoTDB?

Posted by Junqing Wang <ir...@gmail.com>.

Agree with Eric Pai.

The quality of the cluster version is very important. For now, we should
focus more on quality than on a new architecture. Without quality,
the cluster version is hard to use in a production environment.

Later I will contribute a PR with an improved IT testing framework that
reuses the existing IT cases by replaying them in a cluster environment.

I also created a #integration-test channel in Slack; everyone is welcome to
join the discussion.

Eric Pai <er...@hotmail.com> wrote on Wed, Nov 10, 2021 at 11:14 AM:

> Hi,
>
> It's nice to plan the next-generation architecture of IoTDB.
>
> However, before we start to design and discuss refinements and
> optimizations, we should make sure the cluster quality meets our
> expectations. In our internal tests, by replaying the existing IT cases in
> a cluster environment, we found that many basic functions work abnormally,
> e.g. ORDER BY DESC queries, aggregation queries, some filters with wrong
> serialize/deserialize logic, NPEs, and so on. These bugs are easy to
> reproduce, but no one has fully tested for them, and there is no framework
> to ensure that every new or existing feature works well in both cluster
> and standalone environments. Manual testing by contributors is heavy work
> with low scenario coverage.
>
> Designing a new architecture is necessary, but I think quality assurance
> should have higher priority than anything else. Good quality assurance
> gets IoTDB to 60/100, and then we can take it to 100/100 by applying
> optimizations.

Re: What is the next architecture of IoTDB?

Posted by Eric Pai <er...@hotmail.com>.

Hi,

It's nice to plan the next-generation architecture of IoTDB.

However, before we start to design and discuss refinements and optimizations, we should make sure the cluster quality meets our expectations. In our internal tests, by replaying the existing IT cases in a cluster environment, we found that many basic functions work abnormally, e.g. ORDER BY DESC queries, aggregation queries, some filters with wrong serialize/deserialize logic, NPEs, and so on. These bugs are easy to reproduce, but no one has fully tested for them, and there is no framework to ensure that every new or existing feature works well in both cluster and standalone environments. Manual testing by contributors is heavy work with low scenario coverage.

Designing a new architecture is necessary, but I think quality assurance should have higher priority than anything else. Good quality assurance gets IoTDB to 60/100, and then we can take it to 100/100 by applying optimizations.

On 2021/11/10 at 10:48 AM, "songbinghua@iie.ac.cn" <so...@iie.ac.cn> wrote:

What is the next architecture of IoTDB?

As we know, the IoTDB cluster version is based on the standalone architecture: it reuses almost all the basic components of the standalone version and just adds a multi-raft protocol based on Thrift.

As applications spread, the performance of the cluster version of IoTDB is challenged in many aspects:
the number of nodes the cluster can support, the write and read performance with multiple replicas,
the linear scalability of reads and writes in the cluster,
the efficiency and intelligence of the cluster scheduler under dynamic workloads,
joint queries between time series data and relational data,
federated cluster queries across data centers, AZs and regions, spatio-temporal data analysis in one DB, etc.

So I suggest we make a blueprint for the next architecture of IoTDB, and we can learn from the strengths of InfluxDB.
InfluxDB has designed its next architecture, named IOx, which uses DataFusion as the next query engine and adopts Parquet as the file format, etc.

It's time to act. WDYT?

Bruce Song 宋秉华
icloudsong