Posted to dev@tubemq.apache.org by "eivenchan (Jira)" <ji...@apache.org> on 2020/06/01 09:09:00 UTC
[jira] [Commented] (TUBEMQ-123) Batch flush data to disk

    [ https://issues.apache.org/jira/browse/TUBEMQ-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120869#comment-17120869 ] 

eivenchan commented on TUBEMQ-123:
----------------------------------

Below is my proposal:
 # Read the batch size from configuration in MsgMemStore.
 # In MsgMemStore.flush(), take the corresponding amount of message data from cacheDataSegment and cachedIndexSegment, then invoke MsgFileStore.appendMsg().
 # Refactor MsgFileStore.appendMsg() to accept arrays of msgDataBufs and msgIndexBufs.
 # Construct byteBufferIndex from msgIndexBufs, then append msgDataBufs to curDataSeg and byteBufferIndex to curIndexSeg respectively.
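
A rough sketch of steps 2 to 4, only to illustrate the idea and not the actual TubeMQ code. The names BatchFlushSketch, splitIntoBatches, appendBatch, flush and maxBatchBytes are hypothetical, and the real MsgFileStore.appendMsg() signature may differ. It assumes the cached messages are available as a list of ByteBuffers, groups them into batches of at most the configured size, and appends each batch with one gathering write instead of one write per message:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; none of these names exist in TubeMQ.
final class BatchFlushSketch {

    // Groups buffers into consecutive batches of at most maxBatchBytes.
    // A single buffer larger than the limit forms a batch of its own.
    static List<List<ByteBuffer>> splitIntoBatches(List<ByteBuffer> bufs, int maxBatchBytes) {
        List<List<ByteBuffer>> batches = new ArrayList<>();
        List<ByteBuffer> current = new ArrayList<>();
        int currentBytes = 0;
        for (ByteBuffer buf : bufs) {
            if (!current.isEmpty() && currentBytes + buf.remaining() > maxBatchBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(buf);
            currentBytes += buf.remaining();
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    // Appends one batch using gathering writes; returns the bytes written.
    static long appendBatch(FileChannel channel, List<ByteBuffer> batch) throws IOException {
        ByteBuffer[] srcs = batch.toArray(new ByteBuffer[0]);
        long expected = 0;
        for (ByteBuffer b : srcs) {
            expected += b.remaining();
        }
        long written = 0;
        while (written < expected) {
            written += channel.write(srcs); // may write only part of the batch per call
        }
        return written;
    }

    // Flushes all buffers to the file batch by batch, then forces them to disk once.
    static long flush(Path file, List<ByteBuffer> bufs, int maxBatchBytes) throws IOException {
        long total = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            for (List<ByteBuffer> batch : splitIntoBatches(bufs, maxBatchBytes)) {
                total += appendBatch(ch, batch);
            }
            ch.force(false);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        List<ByteBuffer> msgs = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            msgs.add(ByteBuffer.wrap(new byte[4]));
        }
        Path tmp = Files.createTempFile("batch-flush", ".dat");
        System.out.println("bytes written: " + flush(tmp, msgs, 8));
        Files.delete(tmp);
    }
}
```

The index buffers could go through the same path against the index segment's channel. Whether to force to disk after every batch or once per flush is a separate trade-off that this sketch does not try to settle.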

 

Anything else?

> Batch flush data to disk
> ------------------------
>
>                 Key: TUBEMQ-123
>                 URL: https://issues.apache.org/jira/browse/TUBEMQ-123
>             Project: Apache TubeMQ
>          Issue Type: Sub-task
>            Reporter: Guocheng Zhang
>            Priority: Major
>         Attachments: image-2020-05-16-21-56-43-218.png
>
>
> 4. More effective memory-to-disk operation: At present, messages are flushed from memory to disk one by one. This can be changed to write to disk in batches, one memory block at a time, thereby improving storage efficiency.
>  
> ------------------------------------------------------
> This problem was pointed out by an MQ expert: in the current version, TubeMQ does not handle flushing data from memory to disk well enough. Messages are flushed one by one; the related problems are shown in the following figure:
> !image-2020-05-16-21-56-43-218.png!
>  
>  After searching the documentation and analyzing this problem, a better practice appears to be writing data to disk in units of 4 times the number of bytes, although the best size and its effect on flushing depend on the operating environment.
> So I want to optimize this by flushing the data to disk in batches of a specified (configurable) size to improve disk write efficiency. Combined with the modification in [TUBEMQ-120], it should achieve better results.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)