Posted to dev@iotdb.apache.org by "xiangdong Huang (JIRA)" <ji...@apache.org> on 2019/07/15 12:26:00 UTC
[jira] [Closed] (IOTDB-90) [discuss] take FileNodeProcessorStore away

     [ https://issues.apache.org/jira/browse/IOTDB-90?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xiangdong Huang closed IOTDB-90.
--------------------------------

Now only a version file exists in the system folder.

The side effect is that startup takes longer than before.

> [discuss] take FileNodeProcessorStore away
> ------------------------------------------
>
>                 Key: IOTDB-90
>                 URL: https://issues.apache.org/jira/browse/IOTDB-90
>             Project: Apache IoTDB
>          Issue Type: Sub-task
>            Reporter: xiangdong Huang
>            Priority: Minor
>              Labels: design, discussion
>
> Each Storage Group (SG) in an IoTDB instance has a FileNodeProcessor, and FileNodeProcessorStore saves some information about that FileNodeProcessor.
> I conjecture that FileNodeProcessorStore exists to speed up IoTDB startup, because it stores:
> {code:java}
> private boolean isOverflowed;
> private Map<String, Long> lastUpdateTimeMap;
> private TsFileResource emptyTsFileResource;
> private List<TsFileResource> newFileNodes;
> private int numOfMergeFile;
> private FileNodeProcessorStatus fileNodeProcessorStatus;
> {code}
>  Using the above info, we know whether an SG has overflow files, the last update time for each device*, all TsFiles**, and whether the filenode was in a merge process when the last IoTDB instance was shut down.
> * last update time != last flush time: the last update time is the max timestamp for a device, and the corresponding data point may still be in the memtable, while the last flush time is the max timestamp for a device on disk (in TsFiles).
>  ** The most useful info in each TsFileResource is the start time and the end time of each device.
> If we have a ProcessorStore file like the above, we can quickly restore all of this info. However, we can also get all of it cheaply even without such a Store file:
>  * isOverflowed: if the corresponding overflow folder has files, it is true.
>  * lastUpdateTimeMap: if we always recover all data in the WAL first and only then start serving CRUD requests, this field is unnecessary.
>  * emptyTsFileResource and newFileNodes: the most important info (the start time and the end time of each device) can be obtained from each TsFile by just reading its fileMetadata.
>  * fileNodeProcessorStatus: this also seems unnecessary if we discard all unfinished processes.
> So, I think we can try removing this class. That should improve the write speed and reduce disk IOPS. But there is no hurry to do it.
> The class may play other roles that I am not aware of, so I am leaving this open for discussion here.
>  
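
The recovery steps listed in the description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not IoTDB code: the class `StartupRecoverySketch` and its methods are hypothetical, and each TsFile's per-device end times are assumed to have already been read from its file metadata into a plain map.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical helper (not part of IoTDB): rebuilds two pieces of the
// stored state from what is already on disk.
final class StartupRecoverySketch {

  // isOverflowed: true only if the overflow folder exists and has at least one entry.
  static boolean isOverflowed(Path overflowDir) throws IOException {
    if (!Files.isDirectory(overflowDir)) {
      return false;
    }
    try (Stream<Path> entries = Files.list(overflowDir)) {
      return entries.findAny().isPresent();
    }
  }

  // Each map stands in for one TsFile's per-device end times (assumed to be
  // parsed from that file's metadata elsewhere). The result is the max end
  // time per device across all files, i.e. the last flush time per device.
  static Map<String, Long> latestEndTimes(List<Map<String, Long>> perFileEndTimes) {
    Map<String, Long> latest = new HashMap<>();
    for (Map<String, Long> endTimes : perFileEndTimes) {
      for (Map.Entry<String, Long> e : endTimes.entrySet()) {
        latest.merge(e.getKey(), e.getValue(), Long::max);
      }
    }
    return latest;
  }
}
```

The cost of `latestEndTimes` grows with the number of TsFiles that must be opened, which is consistent with the longer startup time noted in the closing comment.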



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)