Posted to issues@ratis.apache.org by "runzhiwang (Jira)" <ji...@apache.org> on 2020/11/26 13:09:00 UTC
[jira] [Issue Comment Deleted] (RATIS-1176) Benchmark various ways to stream data
[ https://issues.apache.org/jira/browse/RATIS-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
runzhiwang updated RATIS-1176:
------------------------------
Comment: was deleted
(was: bq. I suspect it is better than transferTo(..) since it is using transferToArbitraryChannel(..).
I agree. I think Ratis streaming is faster than transferToArbitraryChannel but slower than transferToTrustedChannel and transferToDirectly. When the primary or a peer receives data in Ratis streaming, Netty needs a DirectByteBuffer to hold the data, as the following image shows, so the read path is: socket -> kernel space -> user space. When the primary sends data to a peer, the path is: user space -> kernel space -> socket. When the primary or a peer saves data to disk, the path is: user space -> kernel space -> disk. So I think Ratis streaming is slower than transferToTrustedChannel, which does not need to copy between user space and kernel space on the primary and the peer.
!image-2020-11-25-07-40-50-383.png!
bq. We may pass MapByteBuffer to our writeAsync(..) method
I agree.)
> Benchmark various ways to stream data
> -------------------------------------
>
> Key: RATIS-1176
> URL: https://issues.apache.org/jira/browse/RATIS-1176
> Project: Ratis
> Issue Type: Sub-task
> Components: client, Streaming
> Reporter: Tsz-wo Sze
> Priority: Major
> Attachments: image-2020-11-25-07-40-50-383.png, screenshot-5.png, screenshot-6.png, screenshot-7.png, screenshot-8.png, screenshot-9.png
>
>
> In RATIS-1175, we provided a WritableByteChannel view of DataStreamOutput in order to support FileChannel.transferTo. However, [~runzhiwang] pointed out that sun.nio.ch.FileChannelImpl.transferTo has three submethods
> - transferToDirectly (fastest)
> - transferToTrustedChannel
> - transferToArbitraryChannel (slowest, requires buffer copying)
> Unfortunately, our current implementation is only able to use transferToArbitraryChannel.
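
To make the point above concrete, here is a minimal sketch of FileChannel.transferTo writing into a custom WritableByteChannel. CollectingChannel, transferAll and roundTrip are hypothetical names used only for illustration; they are not part of Ratis. Because the target is neither a FileChannel nor a selectable socket channel, the JDK has to copy through an intermediate buffer, which per the description above corresponds to the transferToArbitraryChannel path (the internal method names are as reported in this issue and may differ between JDK versions).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class TransferToSketch {
  /** Hypothetical stand-in for a channel view of a stream; it only collects bytes in memory. */
  static final class CollectingChannel implements WritableByteChannel {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private boolean open = true;

    @Override
    public int write(ByteBuffer src) {
      // Drain everything that remains in the source buffer.
      int n = src.remaining();
      byte[] chunk = new byte[n];
      src.get(chunk);
      out.write(chunk, 0, n);
      return n;
    }

    @Override
    public boolean isOpen() {
      return open;
    }

    @Override
    public void close() {
      open = false;
    }

    byte[] bytes() {
      return out.toByteArray();
    }
  }

  /** Transfers the whole file to the target; transferTo may move fewer bytes than requested, so loop. */
  static long transferAll(Path file, WritableByteChannel target) {
    try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
      long size = in.size();
      long position = 0;
      while (position < size) {
        position += in.transferTo(position, size - position, target);
      }
      return position;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Writes data to a temporary file, transfers it into a CollectingChannel, and returns what arrived. */
  static byte[] roundTrip(byte[] data) {
    try {
      Path tmp = Files.createTempFile("transfer-sketch", ".bin");
      try {
        Files.write(tmp, data);
        CollectingChannel target = new CollectingChannel();
        transferAll(tmp, target);
        return target.bytes();
      } finally {
        Files.deleteIfExists(tmp);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The sketch only shows the call shape. It says nothing about throughput, which is what the benchmark proposed here would have to measure.
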
> There are several ideas below to improve the performance. We should benchmark them.
> # Improve the current implementation of WritableByteChannel so that it may be able to use a faster transferTo method.
> # Use [FileChannel.map(..)|https://docs.oracle.com/javase/8/docs/api/java/nio/channels/FileChannel.html#map-java.nio.channels.FileChannel.MapMode-long-long-] and pass MappedByteBuffer to our DataStreamOutput.writeAsync method.
> # Add a new API
> {code}
> //DataStreamOutput
> CompletableFuture<DataStreamReply> writeAsync(File);
> {code}
> Internally, use Netty DefaultFileRegion for zero-copy file transfer:
> https://github.com/netty/netty/blob/4.1/example/src/main/java/io/netty/example/file/FileServerHandler.java#L53
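
The second idea in the list above (FileChannel.map plus a MappedByteBuffer) can be sketched with the JDK alone. AsyncSink below is a hypothetical stand-in with a writeAsync(ByteBuffer) method; it is not the Ratis DataStreamOutput API, and the chunk size is an arbitrary assumption. The sink in demo copies bytes to the heap only so the result can be checked; a real sink would hand the mapped buffer to the network layer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;

final class MappedWriteSketch {
  /** Hypothetical stand-in for a stream output; NOT the Ratis DataStreamOutput API. */
  interface AsyncSink {
    CompletableFuture<Integer> writeAsync(ByteBuffer buffer);
  }

  /** Maps the file chunk by chunk and passes each read-only mapping to the sink. */
  static long writeMapped(Path file, AsyncSink sink, int chunkSize) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
      long size = in.size();
      long written = 0;
      for (long pos = 0; pos < size; pos += chunkSize) {
        long len = Math.min(chunkSize, size - pos);
        MappedByteBuffer mapped = in.map(FileChannel.MapMode.READ_ONLY, pos, len);
        // Waiting on each future keeps the sketch simple; a real client would pipeline the writes.
        written += sink.writeAsync(mapped).join();
      }
      return written;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Round-trips data through a temporary file and an in-memory sink, for verification only. */
  static byte[] demo(byte[] data, int chunkSize) {
    try {
      Path tmp = Files.createTempFile("mapped-sketch", ".bin");
      tmp.toFile().deleteOnExit();
      Files.write(tmp, data);
      ByteArrayOutputStream collected = new ByteArrayOutputStream();
      AsyncSink sink = buffer -> {
        int n = buffer.remaining();
        byte[] chunk = new byte[n];
        buffer.get(chunk);
        collected.write(chunk, 0, n);
        return CompletableFuture.completedFuture(n);
      };
      writeMapped(tmp, sink, chunkSize);
      return collected.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```
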
> The data flow of client -> primary -> peer is as follows.
> If we stream a file and do not calculate a checksum, we use transferTo. In the client, there are 1 DMA copy and 1 DMA gather copy, and no CPU copy. In the primary, there are 3 DMA copies and 3 CPU copies. In the peer, there are 2 DMA copies and 2 CPU copies.
> !screenshot-6.png!
> If we stream a file and calculate a checksum, we use MappedByteBuffer. In the client, there are 2 DMA copies and 1 CPU copy. In the primary, there are 3 DMA copies and 3 CPU copies. In the peer, there are 2 DMA copies and 2 CPU copies.
> !screenshot-7.png!
> If we stream data that is not in a file and calculate a checksum, we use DirectByteBuffer. In the client, there are 2 DMA copies and 2 CPU copies. In the primary, there are 3 DMA copies and 3 CPU copies. In the peer, there are 2 DMA copies and 2 CPU copies.
> !screenshot-8.png!
> We should avoid reading data into the heap, for example into a HeapByteBuffer. In that case, in the client there are 2 DMA copies and 4 CPU copies. In the primary, there are 3 DMA copies and 3 CPU copies. In the peer, there are 2 DMA copies and 2 CPU copies.
> !screenshot-9.png!
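
As a rough illustration of calculating a checksum without first reading the data into a heap array, the sketch below reads a file into a direct ByteBuffer and feeds that buffer straight to java.util.zip.CRC32. The choice of CRC32, the buffer size, and the names DirectChecksumSketch, checksumDirect and demo are assumptions made for this example; the issue does not say which checksum Ratis uses.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.CRC32;

final class DirectChecksumSketch {
  /** Computes a CRC32 over the file using a direct buffer, so no byte[] is allocated by this code. */
  static long checksumDirect(Path file, int bufferSize) {
    ByteBuffer buf = ByteBuffer.allocateDirect(bufferSize);
    CRC32 crc = new CRC32();
    try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
      while (in.read(buf) != -1) {
        buf.flip();
        // update(ByteBuffer) consumes the bytes between position and limit.
        crc.update(buf);
        buf.clear();
      }
      return crc.getValue();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Writes data to a temporary file and returns its checksum, for verification only. */
  static long demo(byte[] data) {
    try {
      Path tmp = Files.createTempFile("checksum-sketch", ".bin");
      try {
        Files.write(tmp, data);
        return checksumDirect(tmp, 8);
      } finally {
        Files.deleteIfExists(tmp);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

In a real client the same direct buffer would then be passed on for sending rather than discarded; that step is left out here.
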
> The following is the flow before Ratis streaming, when ProtoBuf is used to send data. In the client, there are 2 DMA copies and 4 CPU copies. In the leader, there are 3 DMA copies and 7 CPU copies. In the follower, there are 2 DMA copies and 5 CPU copies.
> !screenshot-5.png!
--
This message was sent by Atlassian Jira
(v8.3.4#803005)