Posted to notifications@iotdb.apache.org by "Haiming Zhu (Jira)" <ji...@apache.org> on 2022/11/23 02:57:00 UTC

[jira] [Commented] (IOTDB-5019) [write]data region leader write many wal files after restarting datanode on it

    [ https://issues.apache.org/jira/browse/IOTDB-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17637548#comment-17637548 ] 

Haiming Zhu commented on IOTDB-5019:
------------------------------------

This bug is caused by [IOTDB-4896|https://issues.apache.org/jira/browse/IOTDB-4896]. The change made for that issue skips the memtable flush procedure, and on that path it never calls the memtable flush listener, so the WAL is not released.
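
The failure pattern described above can be illustrated with a minimal sketch. This is not IoTDB code: FlushListener, WalTracker, closeBuggy, closeFixed and the skipFlush flag are hypothetical names invented for the illustration, and the condition that triggers the skip is left abstract. It only shows why returning early from a flush, without notifying the listener, leaves WAL pinned indefinitely.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WalReleaseSketch {

  /** Hypothetical stand-in for the "memtable flush listener" mentioned in the comment. */
  interface FlushListener {
    void onFlushEnd(long memTableId);
  }

  /** Hypothetical WAL bookkeeping: a memtable pins its WAL until it is reported as flushed. */
  static class WalTracker implements FlushListener {
    private final Set<Long> pinned = new HashSet<>();

    void pin(long memTableId) {
      pinned.add(memTableId);
    }

    @Override
    public void onFlushEnd(long memTableId) {
      pinned.remove(memTableId); // WAL for this memtable may now be deleted
    }

    int pinnedCount() {
      return pinned.size();
    }
  }

  /** Buggy shape: the early return skips the flush and the listener call along with it. */
  static void closeBuggy(long memTableId, boolean skipFlush, List<FlushListener> listeners) {
    if (skipFlush) {
      return; // listeners never run, so the WAL stays pinned
    }
    // ... the memtable would be written to disk here ...
    for (FlushListener listener : listeners) {
      listener.onFlushEnd(memTableId);
    }
  }

  /** Fixed shape: the flush work is optional, the listener notification is not. */
  static void closeFixed(long memTableId, boolean skipFlush, List<FlushListener> listeners) {
    if (!skipFlush) {
      // ... the memtable would be written to disk here ...
    }
    for (FlushListener listener : listeners) {
      listener.onFlushEnd(memTableId);
    }
  }

  public static void main(String[] args) {
    WalTracker buggyTracker = new WalTracker();
    WalTracker fixedTracker = new WalTracker();
    List<FlushListener> buggyListeners = Collections.singletonList(buggyTracker);
    List<FlushListener> fixedListeners = Collections.singletonList(fixedTracker);

    for (long id = 0; id < 3; id++) {
      buggyTracker.pin(id);
      closeBuggy(id, true, buggyListeners);
      fixedTracker.pin(id);
      closeFixed(id, true, fixedListeners);
    }
    System.out.println("pinned after buggy path: " + buggyTracker.pinnedCount());
    System.out.println("pinned after fixed path: " + fixedTracker.pinnedCount());
  }
}
```

In the buggy variant every skipped flush leaves one more entry pinned, which matches the symptom of WAL files accumulating far beyond the amount of data written.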

> [write]data region leader write many wal files after restarting datanode on it
> ------------------------------------------------------------------------------
>
>                 Key: IOTDB-5019
>                 URL: https://issues.apache.org/jira/browse/IOTDB-5019
>             Project: Apache IoTDB
>          Issue Type: Bug
>            Reporter: changxue
>            Assignee: Quan Siyi
>            Priority: Major
>         Attachments: empty memtable.jpeg, image-2022-11-22-16-30-27-046.png, image-2022-11-22-16-53-21-186.png
>
>
> [write] The data region leader writes many WAL files after its datanode is restarted.
> environment:
> 3C3D cluster, Nov. 21
> reproduction:
> 1. Use iotdb-benchmarks to write data to the IoTDB cluster for more than 6 hours: only 1 device and 1 sensor, with double values, and 2 replicas.
> 2. Node 46 failed to write data, so I restarted its datanode. Node 46 is the data region leader.
> 3. Continue writing data to the same timeseries for about 8 hours. Most of the data ends up on node 44.
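> Questions:
> 1. Why was the data distributed fairly evenly across nodes 44 and 46 before node 46 was restarted, while after the restart the WAL files are written almost exclusively on node 44?
> 2. Why were so many WAL files written, far exceeding the amount and size of the data?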
>  
> show regions (the output is the same before and after restarting the datanode):
> |RegionId|Type|Status|Database|SeriesSlotId|TimeSlotId|DataNodeId|Host|RpcPort|Role|
> |10|SchemaRegion|Running|root.aggr.g_0|1|0|1|172.20.70.44|6667|Follower|
> |10|SchemaRegion|Running|root.aggr.g_0|1|0|5|172.20.70.46|6667|Leader|
> |11|DataRegion|Running|root.aggr.g_0|1|10|1|172.20.70.44|6667|Follower|
> |11|DataRegion|Running|root.aggr.g_0|1|10|5|172.20.70.46|6667|Leader|
> iotdb-1: 44
> iotdb-2: 45
> iotdb-3: 46
> files:
> {code:java}
> [atmos@i-rh6m726k root.aggr.g_0]$ ansible allnodes -m shell -a "find $IOTDB_HOME/data/datanode/data/sequence/root.aggr.g_0 -type f |wc -l"
> iotdb-1 | CHANGED | rc=0 >>
> 1694
> iotdb-2 | CHANGED | rc=0 >>
> 966
> iotdb-3 | CHANGED | rc=0 >>
> 183
> {code}
> !image-2022-11-22-16-53-21-186.png|width=895,height=348!
> monitor:
> !image-2022-11-22-16-30-27-046.png|width=870,height=629!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)