Posted to dev@iotdb.apache.org by Xiangdong Huang <sa...@gmail.com> on 2019/01/01 15:53:02 UTC

Plans of January of 2019

Hi all,

Happy New Year! Thanks to all the developers and Mentors (and the Champion)
of IoTDB for your efforts in 2018.

I think the most important tasks in January 2019 are migrating all the code
to the Apache repository and finishing the IoTDB website.

Tasks for the website:
- @Xinyi Zhao and @Yi Xu, can you organize contributors to finish the
website, as well as the English documents?
- By the way, I suggest that we write more documents/JavaDocs to help new
developers join our project.

Tasks for the migration to the Apache repository:
- There are 4 steps:
(1) Solve the problem of the I/O speed of flushing the WAL to disk. @Rui Liu,
can you organize developers to solve it?
(2) Merge the branch `kill_Thanos` into master on the thulab
repository. @Gaofei Cao.
(3) Add the Apache license header, change all package names, and add
JavaDocs to each Java file.
(4) Migrate the code from thulab to the Apache repository.
- By the way, don't forget that there are more than 100 issues on
github.com/thulab/iotdb/issues.

To finish all the above, we need the effort of everyone in our community!

Here's to a great 2019!

Thanks,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院

Re: Plans of January of 2019 : the features of kill_thanos branch

Posted by Xiangdong Huang <sa...@gmail.com>.
Hi Gaofei,

I suggest that we start a new mail subject to discuss this. That makes it
easier for those who want to search mails by subject.

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


Gaofei Cao <cg...@foxmail.com> wrote on Fri, Jan 4, 2019 at 11:22 PM:


Re: Plans of January of 2019 : the features of kill_thanos branch

Posted by Gaofei Cao <cg...@foxmail.com>.
The kill_thanos branch in thulab/iotdb refactors most features, as described below.


1. Replacing the single-point calculation logic with batch data loading.
In the previous branch, the two most important methods in the `Reader` of IoTDB are `hasNext` and `next`, which check whether the given query series has a next point and compute that point. Invoking these two methods repeatedly decreases query performance, so we added two new methods, `hasNextBatch` and `nextBatch`. As a result, we load and transfer data in batches rather than one point at a time. These two methods are CPU-friendly.
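
To make the difference concrete, here is a minimal sketch of the two reading styles. It is not the real IoTDB `Reader` code: the names `PointReader`, `BatchReader`, `ArraySeriesReader` and `BatchReaderDemo`, the `long` value type, and the fixed batch size are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

// All names here are hypothetical; this is not the real IoTDB Reader API.
interface PointReader {
    boolean hasNext();
    long next();          // one value per call
}

interface BatchReader {
    boolean hasNextBatch();
    long[] nextBatch();   // up to batchSize values per call
}

// Serves an in-memory series both ways so the two styles can be compared.
class ArraySeriesReader implements PointReader, BatchReader {
    private final long[] data;
    private final int batchSize;
    private int pos = 0;

    ArraySeriesReader(long[] data, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.data = data;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() { return pos < data.length; }

    @Override
    public long next() {
        if (!hasNext()) throw new NoSuchElementException();
        return data[pos++];
    }

    @Override
    public boolean hasNextBatch() { return pos < data.length; }

    @Override
    public long[] nextBatch() {
        if (!hasNextBatch()) throw new NoSuchElementException();
        int end = Math.min(pos + batchSize, data.length);
        long[] batch = Arrays.copyOfRange(data, pos, end);
        pos = end;
        return batch;
    }
}

class BatchReaderDemo {
    // One hasNextBatch/nextBatch pair per batch instead of one pair per point.
    static long sumByBatch(BatchReader reader) {
        long sum = 0;
        while (reader.hasNextBatch()) {
            for (long v : reader.nextBatch()) {
                sum += v;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] series = {1, 2, 3, 4, 5, 6, 7};
        // Batches are [1,2,3], [4,5,6], [7]; prints 28.
        System.out.println(sumByBatch(new ArraySeriesReader(series, 3)));
    }
}
```

With a batch size of 3, the seven points above cost three pairs of interface calls instead of seven; the inner loop then runs over a plain array.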


2. Using NIO.
In this branch, we replaced ByteArrayInputStream with NIO, taking advantage of Java NIO. We use `Channel`, `Buffer`, and `MMap` more frequently.
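
As a rough illustration of the NIO style, the sketch below reads the same values once through a `FileChannel` with a `ByteBuffer` and once through a `MappedByteBuffer`. The class `NioReadSketch`, its method names, and the file layout (a plain sequence of big-endian longs) are assumptions for the example; this is not the TsFile format or the actual reader code.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
class NioReadSketch {
    // Positional read through a FileChannel into a heap ByteBuffer.
    static long[] readWithChannel(Path file, long offset, int count) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(count * Long.BYTES);
            long position = offset;
            while (buf.hasRemaining()) {
                int n = ch.read(buf, position);
                if (n < 0) throw new EOFException("file is shorter than requested");
                position += n;
            }
            buf.flip();
            long[] out = new long[count];
            buf.asLongBuffer().get(out);
            return out;
        }
    }

    // The same read through a read-only memory-mapped region.
    static long[] readWithMmap(Path file, long offset, int count) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map =
                ch.map(FileChannel.MapMode.READ_ONLY, offset, (long) count * Long.BYTES);
            long[] out = new long[count];
            map.asLongBuffer().get(out);
            return out;
        }
    }

    // Writes the values to a temporary file, then reads them back both ways.
    static long[][] roundTrip(long[] values) {
        try {
            ByteBuffer buf = ByteBuffer.allocate(values.length * Long.BYTES);
            for (long v : values) buf.putLong(v);
            Path file = Files.createTempFile("nio-sketch", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, buf.array());
            return new long[][] {
                readWithChannel(file, 0, values.length),
                readWithMmap(file, 0, values.length)
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[][] result = roundTrip(new long[] {10, 20, 30});
        System.out.println(Arrays.toString(result[0]));
        System.out.println(Arrays.toString(result[1]));
    }
}
```

Both paths decode the values in bulk from a buffer rather than byte by byte from a stream; which of the two is faster depends on file size and access pattern.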


3. Adding a file stream manager.
An IoTDB query may involve multiple series, for example the SQL `select * from root.vehicle`. To avoid opening one TsFile multiple times, we adopted a file stream manager, which ensures that a file is opened at most once across IoTDB queries. We use an `ExpiredTimeMap` to manage open file streams and close files that have not been used for a given expiration time. There may be better ways to manage file stream readers; I will keep track of this.
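
The idea can be sketched as a map from file path to an open channel plus a last-used timestamp. This is only a sketch under assumptions: the class `FileStreamManagerSketch` and its methods are hypothetical and are not the real `ExpiredTimeMap`, and the current time is passed in explicitly so the expiry rule is easy to follow.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch; not the real file stream manager or ExpiredTimeMap.
class FileStreamManagerSketch {
    private static final class Entry {
        final FileChannel channel;
        long lastUsedMillis;

        Entry(FileChannel channel, long nowMillis) {
            this.channel = channel;
            this.lastUsedMillis = nowMillis;
        }
    }

    private final Map<Path, Entry> open = new HashMap<>();
    private final long expireMillis;

    FileStreamManagerSketch(long expireMillis) {
        this.expireMillis = expireMillis;
    }

    // Returns the shared channel for the file, opening it only on first use.
    synchronized FileChannel get(Path file, long nowMillis) {
        Path key = file.toAbsolutePath().normalize();
        Entry entry = open.get(key);
        if (entry == null) {
            try {
                entry = new Entry(FileChannel.open(key, StandardOpenOption.READ), nowMillis);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            open.put(key, entry);
        }
        entry.lastUsedMillis = nowMillis;
        return entry.channel;
    }

    // Closes channels idle for at least expireMillis; returns how many were closed.
    synchronized int closeExpired(long nowMillis) {
        int closed = 0;
        Iterator<Entry> it = open.values().iterator();
        while (it.hasNext()) {
            Entry entry = it.next();
            if (nowMillis - entry.lastUsedMillis >= expireMillis) {
                try {
                    entry.channel.close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                it.remove();
                closed++;
            }
        }
        return closed;
    }

    synchronized int openCount() {
        return open.size();
    }
}
```

Note that this sketch does no reference counting, so a channel could be closed while a slow query still holds it. A real manager would need to handle that case.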


4. Optimizing filter efficiency.
First, we removed the previous `Visitor Pattern` implementation of filters and adopted a more intuitive implementation.
Second, we optimized some filter logic to improve performance. For example, in the SQL `select sensor_0, sensor_1 from device_0 where sensor_1 > 10`, we avoid computing the data of `sensor_1` twice.
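
One possible reading of the `sensor_1` example is that a series used both in the select list and in the filter is loaded once and then shared. The sketch below shows only that idea; the class `SharedSeriesQuerySketch`, the per-query cache, and the assumption that all series are aligned arrays of equal length are made up for illustration and are not how the branch actually implements it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongPredicate;

// Hypothetical sketch of sharing one loaded series between filter and output.
class SharedSeriesQuerySketch {
    // Loads each series at most once per query and counts the loads.
    static final class SeriesCache {
        private final Function<String, long[]> loader;
        private final Map<String, long[]> loaded = new HashMap<>();
        int loads = 0;

        SeriesCache(Function<String, long[]> loader) {
            this.loader = loader;
        }

        long[] get(String series) {
            return loaded.computeIfAbsent(series, s -> {
                loads++;
                return loader.apply(s);
            });
        }
    }

    // Returns the selected columns for every row where the filter holds.
    // Assumes all series are time-aligned arrays of the same length.
    static List<long[]> query(SeriesCache cache, List<String> select,
                              String filterSeries, LongPredicate filter) {
        long[] filterValues = cache.get(filterSeries);
        List<long[]> rows = new ArrayList<>();
        for (int i = 0; i < filterValues.length; i++) {
            if (!filter.test(filterValues[i])) {
                continue;
            }
            long[] row = new long[select.size()];
            for (int c = 0; c < row.length; c++) {
                row[c] = cache.get(select.get(c))[i];
            }
            rows.add(row);
        }
        return rows;
    }
}
```

For `select sensor_0, sensor_1 ... where sensor_1 > 10`, the cache ends up with two loads instead of three, because the filter and the output share the same `sensor_1` data.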


5. Others, such as removing Thrift serialization and changing the TsFile file format. Maybe someone else can add to this list.


I suggest merging it into the master branch next week.


Experimental results show that query performance in the kill_thanos branch improves by approximately 30% to 60%.


By the way, I am considering how to get a standard, convincing test dataset (in the IoT domain) to test the write and query performance of IoTDB. Currently, we just use the data generated by `IoTDB Benchmark` (another project, also available at github.com/thulab/iotdb-benchmark), which generated 100,000 rows of records for 100 devices * 100 sensors.


Thanks & Best Regards 


-----------------------------------
Cao Gaofei (曹高飞)
School of Software,
Tsinghua University
-----------------------------------

Re:Re: Plans of January of 2019

Posted by 徐毅 <xu...@126.com>.
Hi


I have some questions about ASF's Jenkins.
1) Can anyone who forks our project use this Jenkins when creating a pull request from the fork?
2) In our own Jenkins, we have a pipeline that runs the unit tests for every sub-module. For the iotdb sub-module in particular, we have two pipelines, which run the unit tests and the integration tests. We run our tests on different operating systems (currently Windows 10 and Ubuntu; we will add OS X soon) and different JDK versions (JDK 8 and JDK 11); see the picture attached to the original mail. Some of the tests are time-consuming and need a lot of memory (our integration tests take 5 minutes per run). So I wonder whether ASF's Jenkins can provide the capacity to run multiple tests like these?

Thanks

Yi Xu


At 2019-01-02 01:28:26, "Christofer Dutz" <ch...@c-ware.de> wrote:
>Hi all,
>
>a happy new year from me too ... 
>I think we should add one or two more steps:
>
>(5) set up the Jenkins build on ASF's Jenkins (I'd be glad to help with that)
>
>Regarding the issues ... ever thought of programmatically migrating them to Jira? Sort of a run-once program, that pulls the issues from github and posts them in Jira?
>Done something like that several times before (however never for Apache projects)
>
>Chris

Re: Plans of January of 2019

Posted by "Kevin A. McGrail" <km...@apache.org>.
I think you should rename the master branch kill_thanos.  Love it.  Happy
new year all!

On Tue, Jan 1, 2019, 17:28 Christofer Dutz <christofer.dutz@c-ware.de> wrote:

> Hi all,
>
> a happy new year from me too ...
> I think we should add one or two more steps:
>
> (5) set up the Jenkins build on ASF's Jenkins (I'd be glad to help with
> that)
>
> Regarding the issues ... ever thought of programmatically migrating them
> to Jira? Sort of a run-once program, that pulls the issues from github and
> posts them in Jira?
> Done something like that several times before (however never for Apache
> projects)
>
> Chris

Re: Plans of January of 2019

Posted by Christofer Dutz <ch...@c-ware.de>.
Hi all,

a happy new year from me too ... 
I think we should add one or two more steps:

(5) set up the Jenkins build on ASF's Jenkins (I'd be glad to help with that)

Regarding the issues ... ever thought of programmatically migrating them to Jira? Sort of a run-once program, that pulls the issues from github and posts them in Jira?
Done something like that several times before (however never for Apache projects)

Chris