Posted to issues@ratis.apache.org by "Song Ziyang (Jira)" <ji...@apache.org> on 2022/10/11 16:46:00 UTC

[jira] [Comment Edited] (RATIS-1717) Reduce data copy when serialize LogEntry to RaftLog

    [ https://issues.apache.org/jira/browse/RATIS-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17615963#comment-17615963 ] 

Song Ziyang edited comment on RATIS-1717 at 10/11/22 4:45 PM:
--------------------------------------------------------------

[~szetszwo] Oops, we {*}cannot directly serialize LogEntryProto to directByteBuffer because of the checksum{*}. The Checksum class in JDK 8 cannot calculate a checksum over a byte sequence sitting in direct memory. Therefore, a copy to a heap byte array seems inevitable, unless we dare to *use Unsafe operations* to compute the checksum on direct memory :).

Still, we can allocate a global buffer (4 MB) once instead of allocating a temporary array on every write. Simulation results show that this does reduce young GC. I've submitted https://issues.apache.org/jira/secure/attachment/13050789/idea.patch; please take a look. Also, should we try Unsafe operations?
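
To make the buffer-reuse idea concrete, here is a minimal sketch; it is not the content of idea.patch. The class name ReusableChecksumBuffer, the 4 MB default and the use of java.util.zip.CRC32 are assumptions made for illustration (the checksum implementation Ratis actually uses may differ). The sketch copies bytes out of a direct ByteBuffer in chunks through one long-lived heap array and feeds them to Checksum.update(byte[], int, int), so no temporary array is allocated per write.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical helper, not part of Ratis: checksums direct memory through
// one reusable heap array instead of a fresh temporary array per write.
final class ReusableChecksumBuffer {
  // Assumed default, taken from the "4M" figure mentioned in the comment.
  static final int DEFAULT_SIZE = 4 << 20;

  private final byte[] shared;                   // allocated once, reused for every call
  private final Checksum checksum = new CRC32(); // assumption: stand-in checksum

  ReusableChecksumBuffer() {
    this(DEFAULT_SIZE);
  }

  ReusableChecksumBuffer(int size) {
    this.shared = new byte[size];
  }

  /** Checksums the remaining bytes of buf without moving its position. */
  long checksum(ByteBuffer buf) {
    checksum.reset();
    ByteBuffer view = buf.duplicate();           // independent position and limit
    while (view.hasRemaining()) {
      int n = Math.min(view.remaining(), shared.length);
      view.get(shared, 0, n);                    // direct memory -> shared heap array
      checksum.update(shared, 0, n);
    }
    return checksum.getValue();
  }
}
```

The shared array and the Checksum instance are not thread-safe, so a sketch like this only fits a single writer thread or needs external synchronization.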

 


was (Author: JIRAUSER281912):
[~szetszwo] Oops, we {*}cannot directly serialize LogEntryProto to directByteBuffer because of the checksum{*}. The Checksum class in JDK 8 cannot calculate a checksum over a byte sequence sitting in direct memory. Therefore, a copy to a heap byte array seems inevitable, unless we dare to *use Unsafe operations* to compute the checksum on direct memory :).

Still, we can allocate a global buffer (4 MB) once instead of allocating a temporary array on every write. Simulation results show that this does reduce young GC. I've submitted an idea patch ([^idea.patch]). Please take a look. Also, should we try Unsafe operations?

 

> Reduce data copy when serialize LogEntry to RaftLog
> ---------------------------------------------------
>
>                 Key: RATIS-1717
>                 URL: https://issues.apache.org/jira/browse/RATIS-1717
>             Project: Ratis
>          Issue Type: Improvement
>          Components: performance
>            Reporter: Song Ziyang
>            Priority: Major
>         Attachments: idea.patch
>
>
> In the current implementation, before a log entry is written to RaftLog, it is first serialized to a byte array[1] and then written to the FileChannel through a directByteBuffer[2]. This requires two data copies: one from LogEntryProto to the byte array, and one from the byte array to the directByteBuffer.
> We can save one copy by *serializing LogEntry directly to the directByteBuffer, removing the byte array:*
>  # We need a directByteBuffer large enough to hold a single log entry (maxBufferSize).
>  # When we need to serialize a log entry to RaftLog, we first check whether there is enough space remaining in the directByteBuffer. If not, we first flush the directByteBuffer to the FileChannel. In either case, we then serialize the data directly into the directByteBuffer.
>  # When the data buffered in the directByteBuffer reaches the threshold (say 64 KB), we flush the directByteBuffer, the same as in the current implementation.
> [1] [https://github.com/apache/ratis/blob/289db3ae64e8cf620eba882468dda661af0439bc/ratis-server/src/main/java/org/apache/ratis/server/raftlog/segmented/SegmentedRaftLogOutputStream.java#L96]
> [2] [https://github.com/apache/ratis/blob/289db3ae64e8cf620eba882468dda661af0439bc/ratis-server/src/main/java/org/apache/ratis/server/raftlog/segmented/BufferedWriteChannel.java#L65]
>  
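
The three steps in the description above can be sketched roughly as follows. This is an illustrative sketch only, not the code in BufferedWriteChannel or SegmentedRaftLogOutputStream: the names DirectBufferedWriter and Entry are hypothetical, Entry stands in for a log entry that can report its serialized size and write itself into a ByteBuffer, and checksum handling is left out entirely.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

// Hypothetical sketch of the proposal: entries are serialized straight into
// a direct ByteBuffer, with no intermediate heap byte array.
final class DirectBufferedWriter {
  /** Hypothetical stand-in for a log entry that can serialize itself. */
  interface Entry {
    int serializedSize();
    void writeTo(ByteBuffer dst);
  }

  private final WritableByteChannel channel;
  private final ByteBuffer buffer;   // step 1: large enough for one entry
  private final int flushThreshold;  // step 3: e.g. 64 KB

  DirectBufferedWriter(WritableByteChannel channel, int maxBufferSize, int flushThreshold) {
    this.channel = channel;
    this.buffer = ByteBuffer.allocateDirect(maxBufferSize);
    this.flushThreshold = flushThreshold;
  }

  void write(Entry entry) throws IOException {
    int size = entry.serializedSize();
    if (size > buffer.capacity()) {
      throw new IllegalArgumentException("entry larger than maxBufferSize: " + size);
    }
    if (size > buffer.remaining()) {
      flush();                       // step 2: make room before serializing
    }
    entry.writeTo(buffer);           // serialize directly into direct memory
    if (buffer.position() >= flushThreshold) {
      flush();                       // step 3: threshold reached
    }
  }

  void flush() throws IOException {
    buffer.flip();
    while (buffer.hasRemaining()) {
      channel.write(buffer);         // a channel may write fewer bytes than requested
    }
    buffer.clear();
  }
}
```

Sizing the buffer to maxBufferSize is what lets step 2 always succeed after a flush; an entry larger than that is rejected here, which is a choice of this sketch rather than something stated in the issue.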



--
This message was sent by Atlassian Jira
(v8.20.10#820010)