Posted to dev@iotdb.apache.org by 孙泽嵩 <sz...@mails.tsinghua.edu.cn> on 2019/10/18 14:21:39 UTC

Re: [jira] [Created] (IOTDB-234) Refactor TsFile storage on HDFS

Hi, I have finished refactoring TsFile storage on HDFS. Here is my PR [1].

TsFile storage on different file systems is implemented with the factory pattern. When IoTDB starts, `FSFactoryProducer` produces a different set of factories according to the user configuration: `LocalFSFactory`, `LocalFSInputFactory` and `LocalFSOutputFactory` for the local file system, and `HDFSFactory`, `HDFSInputFactory` and `HDFSOutputFactory` for HDFS.
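To make the idea concrete, here is a minimal sketch of how a producer could pick a factory from a configured file system type. It is not the code from the PR: the class `FsFactoryProducerSketch`, the `FsFactory` interface, its `locate` method and the two `...Sketch` factories are hypothetical names chosen for illustration, and the URI prefixes are only placeholders.

```java
import java.util.Locale;

public class FsFactoryProducerSketch {

  // Hypothetical enum mirroring the two storage options discussed in this mail.
  enum FsType { LOCAL, HDFS }

  // Hypothetical, deliberately tiny factory interface (the real factories do more).
  interface FsFactory {
    String locate(String path);
  }

  // Stand-in for the local file system factory.
  static final class LocalFactorySketch implements FsFactory {
    @Override
    public String locate(String path) {
      return "file://" + path;
    }
  }

  // Stand-in for the HDFS factory.
  static final class HdfsFactorySketch implements FsFactory {
    @Override
    public String locate(String path) {
      return "hdfs://" + path;
    }
  }

  // Chooses a factory once, based on the configured value (case-insensitive).
  static FsFactory produce(String configuredFs) {
    FsType type = FsType.valueOf(configuredFs.trim().toUpperCase(Locale.ROOT));
    switch (type) {
      case HDFS:
        return new HdfsFactorySketch();
      case LOCAL:
      default:
        return new LocalFactorySketch();
    }
  }

  public static void main(String[] args) {
    System.out.println(produce("LOCAL").locate("/data/a.tsfile"));
    System.out.println(produce("hdfs").locate("/data/a.tsfile"));
  }
}
```

Callers only see the `FsFactory` interface, so the rest of the code does not need to know which file system was configured.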

From now on, the TsFile module will not depend on the Hadoop libs. If you want to store TsFile and related data files in HDFS, here are the steps:

1. Build the server and Hadoop modules with: `mvn clean package -pl server,hadoop -am -Dmaven.test.skip=true`

2. Then copy the target jar of the Hadoop module, `hadoop-tsfile-0.9.0-SNAPSHOT-jar-with-dependencies.jar`, into the server target lib folder `.../server/target/iotdb-server-0.9.0-SNAPSHOT/lib`.

3. Edit the user config in `iotdb-engine.properties`. Start the server, and TsFiles will be stored on HDFS.
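Step 2 matters because, as described in IOTDB-234, the TsFile module obtains the HDFS factories through Java reflection instead of a compile-time dependency. The sketch below shows the general technique only: `ReflectiveFactoryLookupSketch`, `loadFactory` and the class name `org.example.hadoop.HDFSFactory` are hypothetical, and the error message is my own wording, not what IoTDB prints.

```java
public class ReflectiveFactoryLookupSketch {

  // Hypothetical class name, for illustration only; see the PR for the real one.
  static final String HDFS_FACTORY_CLASS = "org.example.hadoop.HDFSFactory";

  // Loads a class by name at runtime and instantiates it via its no-arg constructor.
  static Object loadFactory(String className) {
    try {
      Class<?> clazz = Class.forName(className);
      return clazz.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      // Reached when the class is not on the classpath (e.g. the jar was not copied)
      // or it cannot be instantiated.
      throw new IllegalStateException(
          "Cannot load " + className + "; is the Hadoop module jar in the lib folder?", e);
    }
  }

  public static void main(String[] args) {
    // A JDK class stands in for a factory that is present on the classpath.
    Object present = loadFactory("java.util.ArrayList");
    System.out.println("loaded: " + present.getClass().getName());

    // The hypothetical HDFS factory is absent here, so the lookup fails at runtime.
    try {
      loadFactory(HDFS_FACTORY_CLASS);
    } catch (IllegalStateException e) {
      System.out.println("failed: " + e.getMessage());
    }
  }
}
```

With this kind of lookup, a missing jar only shows up at runtime, when the class is first requested, which is why the jar has to be in the lib folder before the server starts.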

If you'd like to switch the storage file system back to local, just set the configuration `tsfile_storage_fs` to `LOCAL`. In this case, if you already have some data files on HDFS, you should either download them to local storage and move them to your configured data file folder (`../server/target/iotdb-server-0.9.0-SNAPSHOT/data/data` by default), or restart your process and import the data into IoTDB.

After the change to the documentation structure is finished and merged into master, I will add a detailed User Guide about the shared storage architecture to the documentation (here is the draft: [2] for English and [3] for Chinese).

If you have any suggestions or ideas, please discuss them with me.


P.S. Merging PR [1] may cause conflicts in other recent PRs because of the refactoring… If you run into conflicts in your PR, I’ll be very glad to help you resolve them. : )


[1] https://github.com/apache/incubator-iotdb/pull/417
[2] https://github.com/samperson1997/incubator-iotdb/blob/6a8b8188c29e158d35652ddf14fdb5267329606c/docs/Documentation/UserGuide/10-Distributed%20Architecture/1-TsFile%20Storage.md
[3] https://github.com/samperson1997/incubator-iotdb/blob/hdfs_doc/docs/Documentation-CHN/UserGuide/10-Distributed%20Architecture/1-TsFile%20Storage.md


Best,
-----------------------------------
Zesong Sun
School of Software, Tsinghua University

孙泽嵩
清华大学 软件学院

> On Sep 22, 2019, at 18:23, Zesong Sun (Jira) <ji...@apache.org> wrote:
> 
> Zesong Sun created IOTDB-234:
> --------------------------------
> 
>             Summary: Refactor TsFile storage on HDFS
>                 Key: IOTDB-234
>                 URL: https://issues.apache.org/jira/browse/IOTDB-234
>             Project: Apache IoTDB
>          Issue Type: Improvement
>            Reporter: Zesong Sun
> 
> 
> Refactor TsFile storage on HDFS codes:
> * Extract the FileSystem factories into Hadoop module from TSFile module, so that TSFile module will not depend on Hadoop libs.
> * Use Java Reflection to get FileSystem factories in TSFile module.
> 
> 
> 
> --
> This message was sent by Atlassian Jira
> (v8.3.4#803005)