Posted to dev@iotdb.apache.org by Zhou Yifu <ef...@outlook.com> on 2022/05/23 12:45:21 UTC
Re: Re: Flush function in cluster

Hi all,
As I understand the earlier discussion, this operation should currently behave the same as in the previous version, where it was mainly used for debugging. Is that right? If it is mainly for debugging, I think it is fine to redefine it and specify it in more detail. But if we want it to be a frequently used command in the new cluster version, we should be very careful and wait until the command is stable before releasing it. In the previous cluster version, I remember flush had some bugs that were hard for us to recover from. For now, maybe we can add a warning note for this command in the user guide, telling users to use it with caution.

Thanks,
Yifu Zhou

From: Jialin Qiao<ma...@apache.org>
Sent: May 23, 2022 12:51
To: dev@iotdb.apache.org<ma...@iotdb.apache.org>
Subject: Re: Re: Flush function in cluster

Hi,

flush could be used in the following scenarios:

1. Test the compression ratio: a user writes some data into IoTDB and wants
to get the compression ratio, so they need to run flush to clear the WAL.
2. A DBA wants to debug a datanode to see whether a bug comes from the
memtable or the TsFile.
3. IoTDB developers write integration tests (IT); flush helps build
different test cases.

As for commands like show datanodes, they could be usable only by the root user.

Thanks,
―――――――――――――――――
Jialin Qiao
Apache IoTDB PMC


jianyun cheng <ch...@outlook.com> wrote on Mon, May 23, 2022 at 12:10:

> Who can execute the flush operation?
>
> This is a very dangerous operation that may block data ingestion, so
> permissions for such commands are very important. In my opinion, only the
> DBA should be allowed to execute them. The same restriction should apply to
> other similar OP commands, such as list cluster data/config nodes, show
> cluster configuration, show region set on some data nodes…, once we have
> them. These commands help the DBA understand the cluster status and should
> not be run by any other user.
>
> It’s better to separate such OP commands from data operation commands.
>
> ----------------------------------------------------------
> Jianyun Cheng
> Thanks
>
> From: Jialin Qiao<ma...@apache.org>
> Sent: Monday, May 23, 2022 11:55 AM
> To: dev@iotdb.apache.org<ma...@iotdb.apache.org>
> Subject: Re: Re: Flush function in cluster
>
> Hi,
>
> In the previous version, flush was mainly used for debugging.
> In addition, before shutdown we want to do a flush to accelerate restarting;
> this could be built into stop-server.sh.
>
> In the data region, flush can be treated like a read operation: there is no
> need to keep all replicas in the same data format (WAL or TsFile), as long as
> they hold the same data points.
>
> Thanks,
> ―――――――――――――――――
> Jialin Qiao
> Apache IoTDB PMC
>
>
> 李思佳 <li...@360.cn> wrote on Mon, May 23, 2022 at 11:47:
>
> > "flush can reduce memory and speed up the restart process": this assumes
> > that all copies have been flushed synchronously, so we can ensure that the
> > data files are logically consistent at this point.
> >
> > Flushing on a datanode should be part of releasing resources before the
> > node is shut down (but this does not guarantee that all copies are
> > logically consistent at that point). For example, shutdownHook requires
> > disk flushing and resource release by default. If we need to provide a
> > flush command for this scenario, is it perhaps because our node shutdown
> > operation is incomplete?
> >
> > BR,
> > -----------------------------------
> > Sijia Li
> >
> >
> > -----Original Message-----
> > From: Xiangdong Huang <sa...@gmail.com>
> > Sent: May 23, 2022 11:37
> > To: dev <de...@iotdb.apache.org>
> > Subject: Re: Flush function in cluster
> >
> > I think distinguishing between flushing one node and flushing the whole
> > cluster is meaningful.
> >
> > As you said, flush can reduce memory and speed up the restart process. So
> > what if the DBA just wants to restart one node?
> >
> > However, the default behavior can be discussed: flush on one node by
> > default or on the whole cluster by default.
> >
> > -----------------------------------
> > Xiangdong Huang
> > School of Software, Tsinghua University
> >
> >  黄向东
> > 清华大学 软件学院
> >
> >
> > 李思佳 <li...@360.cn> wrote on Mon, May 23, 2022 at 11:28:
> >
> > > Sorry, I don't understand the purpose of flushing the current
> > > datanode.
> > >
> > > IMO, flush all should mean that all storage groups are flushed; in
> > > other words, flush sg is a subset of flush all.
> > >
> > > For users, the distributed system is a black box, while the SG is an
> > > exposed structure. Therefore, CLI commands do not need to be aware of
> > > the relationship between the datanode and the SGs users create
> > > themselves.
> > >
> > > In addition, the flush operation may speed up our restart recovery
> > > process. For example, when we flush an SG successfully, we can label
> > > the associated data files to indicate that all copies are consistent
> > > at that point in time (flush and write priorities matter here). During
> > > the next restart, we can use this flag to quickly skip the verification
> > > step.
> > >
> > > In summary, here are my questions and thoughts:
> > > 1. Is it necessary to flush a datanode? What are the benefits?
> > > 2. Can the flush operation affect the consensus group or WAL for a
> > > quick restart?
> > >
> > > BR,
> > > -----------------------------------
> > > Sijia Li
> > >
> > >
> > > -----Original Message-----
> > > From: Jialin Qiao <qi...@apache.org>
> > > Sent: May 23, 2022 11:07
> > > To: dev@iotdb.apache.org
> > > Subject: Flush function in cluster
> > >
> > > Hi,
> > >
> > > Flush is a frequently used command in IoTDB; it flushes memtables
> > > to disk and closes all TsFiles.
> > >
> > > In the new cluster, we need to redefine this function [1].
> > >
> > > * flush: flushing current datanode
> > >
> > > * flush all/cluster: flushing all datanodes
> > >
> > > * flush sg: flush all DataRegions of a storage group
> > >
> > >
> > > What do you think?
> > >
> > > [1] https://issues.apache.org/jira/browse/IOTDB-3099
> > >
> > > ―――――――――――――――――
> > > Jialin Qiao
> > > Apache IoTDB PMC
> > >
> >
>
>
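
For illustration only, the three scopes proposed in the first message of this thread (flush on the current datanode, flush on all datanodes, flush on the DataRegions of one storage group) can be modeled as below. This is a minimal Python sketch, not IoTDB code: the classes DataNode and DataRegion, the functions regions_to_flush and flush, and the scope names "local", "cluster" and "sg" are all hypothetical, and a list stands in for the memtable.

```python
from dataclasses import dataclass, field


@dataclass
class DataRegion:
    """Hypothetical stand-in for a DataRegion of one storage group."""
    storage_group: str
    memtable: list = field(default_factory=list)  # unflushed points
    tsfiles: list = field(default_factory=list)   # closed, immutable "files"

    def flush(self) -> None:
        # Move buffered points into a new closed file and empty the memtable.
        if self.memtable:
            self.tsfiles.append(tuple(self.memtable))
            self.memtable.clear()


@dataclass
class DataNode:
    """Hypothetical stand-in for a datanode hosting several DataRegions."""
    node_id: int
    regions: list


def regions_to_flush(cluster, local_node_id, scope, storage_group=None):
    """Resolve a flush scope to the DataRegions it would touch."""
    if scope == "local":    # "flush": current datanode only
        return [r for n in cluster if n.node_id == local_node_id
                for r in n.regions]
    if scope == "cluster":  # "flush all/cluster": every datanode
        return [r for n in cluster for r in n.regions]
    if scope == "sg":       # "flush sg": all DataRegions of one storage group
        if storage_group is None:
            raise ValueError("scope 'sg' needs a storage group name")
        return [r for n in cluster for r in n.regions
                if r.storage_group == storage_group]
    raise ValueError(f"unknown flush scope: {scope}")


def flush(cluster, local_node_id, scope, storage_group=None):
    """Flush every region in the scope; return how many regions were visited."""
    targets = regions_to_flush(cluster, local_node_id, scope, storage_group)
    for region in targets:
        region.flush()
    return len(targets)
```

In this model "sg" selects regions across datanodes and "cluster" selects every region, so every flush sg target is also a flush all target, which matches the subset relationship raised in the thread. Who may run each scope (for example, only the root user or a DBA) is deliberately left out of the sketch.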