Posted to dev@ratis.apache.org by GitBox <gi...@apache.org> on 2020/12/06 06:04:04 UTC

[GitHub] [incubator-ratis] runzhiwang opened a new pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

runzhiwang opened a new pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325


   ## What changes were proposed in this pull request?
   
   Change the FileStore CLI to use Streaming
   
   ## What is the link to the Apache JIRA?
   
   
   https://issues.apache.org/jira/browse/RATIS-1186
   
   ## How was this patch tested?
   
   1.
   ```
   BIN=ratis-examples/src/main/bin
   PEERS=n0:127.0.0.1:6000:7000,n1:127.0.0.1:6001:7001,n2:127.0.0.1:6002:7002
   
   ID=n0; ${BIN}/server.sh filestore server --id ${ID} --storage /tmp/ratis/${ID} --peers ${PEERS}
   ID=n1; ${BIN}/server.sh filestore server --id ${ID} --storage /tmp/ratis/${ID} --peers ${PEERS}
   ID=n2; ${BIN}/server.sh filestore server --id ${ID} --storage /tmp/ratis/${ID} --peers ${PEERS}
   ```
   
   2. 
   DataStreamApi
   ```
   ${BIN}/client.sh filestore datastream --size 100000000 --numFiles 10 --bufferSize 1000000 --type DirectByteBuffer --peers ${PEERS}
   ${BIN}/client.sh filestore datastream --size 100000000 --numFiles 10 --bufferSize 1000000 --type MappedByteBuffer --peers ${PEERS}
   ${BIN}/client.sh filestore datastream --size 100000000 --numFiles 10 --bufferSize 1000000 --type transferTo --peers ${PEERS}
   ```
   AsyncApi
   `${BIN}/client.sh filestore loadgen --size 100000000 --numFiles 10 --bufferSize 4096 --peers ${PEERS}`
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang merged pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang merged pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325


   





[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739460991


   @szetszwo This PR can reproduce the bug in https://issues.apache.org/jira/browse/RATIS-1207.
   We just need to run the following command twice. The first run succeeds; the second fails because of the duplicated stream key.
   `${BIN}/client.sh filestore datastream --size 100000000 --numFiles 10 --bufferSize 1000000 --type DirectByteBuffer --peers ${PEERS}`





[GitHub] [incubator-ratis] runzhiwang commented on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739476132


   @szetszwo Not sure what happened, but DataStreamApi with type DirectByteBuffer seems slower than AsyncApi.
   





[GitHub] [incubator-ratis] szetszwo commented on a change in pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
szetszwo commented on a change in pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#discussion_r536990821



##########
File path: ratis-examples/src/main/java/org/apache/ratis/examples/filestore/cli/DataStream.java
##########
@@ -0,0 +1,145 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.ratis.examples.filestore.cli;
+
+import com.beust.jcommander.Parameter;
+import com.beust.jcommander.Parameters;
+import org.apache.ratis.client.RaftClient;
+import org.apache.ratis.client.api.DataStreamOutput;
+import org.apache.ratis.examples.filestore.FileStoreClient;
+import org.apache.ratis.protocol.DataStreamReply;
+
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.MappedByteBuffer;
+import java.nio.channels.FileChannel;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.CompletableFuture;
+
+/**
+ * Subcommand to generate load in filestore data stream state machine.
+ */
+@Parameters(commandDescription = "Load Generator for FileStore DataStream")
+public class DataStream extends Client {
+
+  @Parameter(names = {"--type"}, description = "DirectByteBuffer, MappedByteBuffer, transferTo", required = true)
+  private String dataStreamType;
+
+  @Override
+  protected void operation(RaftClient client) throws IOException {
+    List<String> paths = generateFiles();
+    FileStoreClient fileStoreClient = new FileStoreClient(client);
+    System.out.println("Starting DataStream write now ");
+
+    long startTime = System.currentTimeMillis();
+
+    long totalWrittenBytes = waitStreamFinish(streamWrite(paths, fileStoreClient));
+
+    long endTime = System.currentTimeMillis();
+
+    System.out.println("Total files written: " + numFiles);
+    System.out.println("Each files size: " + fileSizeInBytes);
+    System.out.println("Total data written: " + totalWrittenBytes + " bytes");
+    System.out.println("Total time taken: " + (endTime - startTime) + " millis");
+
+    client.close();
+    System.exit(0);
+  }
+
+  private Map<String, List<CompletableFuture<DataStreamReply>>> streamWrite(
+      List<String> paths, FileStoreClient fileStoreClient) throws IOException {
+    Map<String, List<CompletableFuture<DataStreamReply>>> fileMap = new HashMap<>();
+    for(String path : paths) {
+      File file = new File(path);
+      FileInputStream fis = new FileInputStream(file);
+      final DataStreamOutput dataStreamOutput = fileStoreClient.getStreamOutput(path, (int) file.length());
+
+      if (dataStreamType.equals("DirectByteBuffer")) {
+        fileMap.put(path, writeByDirectByteBuffer(dataStreamOutput, fis.getChannel()));
+      } else if (dataStreamType.equals("MappedByteBuffer")) {
+        fileMap.put(path, writeByMappedByteBuffer(dataStreamOutput, fis.getChannel()));
+      } else if (dataStreamType.equals("transferTo")) {

Review comment:
       Let's call this "FilePositionCount" or "NettyFileRegion".  TransferTo sounds like FileChannel.transferTo.

##########
File path: ratis-examples/src/main/java/org/apache/ratis/examples/filestore/cli/DataStream.java
##########
@@ -0,0 +1,145 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.ratis.examples.filestore.cli;
+
+import com.beust.jcommander.Parameter;
+import com.beust.jcommander.Parameters;
+import org.apache.ratis.client.RaftClient;
+import org.apache.ratis.client.api.DataStreamOutput;
+import org.apache.ratis.examples.filestore.FileStoreClient;
+import org.apache.ratis.protocol.DataStreamReply;
+
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.MappedByteBuffer;
+import java.nio.channels.FileChannel;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.CompletableFuture;
+
+/**
+ * Subcommand to generate load in filestore data stream state machine.
+ */
+@Parameters(commandDescription = "Load Generator for FileStore DataStream")
+public class DataStream extends Client {
+
+  @Parameter(names = {"--type"}, description = "DirectByteBuffer, MappedByteBuffer, transferTo", required = true)

Review comment:
       Let's make it default to the fastest option.
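
One way the two review suggestions above could fit together is sketched below. It is only an illustration, not the code from this PR: the enum `StreamWriteType`, its `parse` method and the `DEFAULT` constant are hypothetical names. The thread does not establish which option is fastest, so the default chosen here is a placeholder assumption.

```java
/**
 * Hypothetical enum for the --type option (not part of Ratis).
 * "NettyFileRegion" follows the naming suggested in the review.
 */
public enum StreamWriteType {
  DIRECT_BYTE_BUFFER("DirectByteBuffer"),
  MAPPED_BYTE_BUFFER("MappedByteBuffer"),
  NETTY_FILE_REGION("NettyFileRegion");

  // Placeholder assumption: replace with whichever option benchmarks fastest.
  public static final StreamWriteType DEFAULT = NETTY_FILE_REGION;

  private final String cliName;

  StreamWriteType(String cliName) {
    this.cliName = cliName;
  }

  /** The spelling expected on the command line. */
  public String cliName() {
    return cliName;
  }

  /** Maps a command-line value to a type; null or empty falls back to DEFAULT. */
  public static StreamWriteType parse(String value) {
    if (value == null || value.isEmpty()) {
      return DEFAULT;
    }
    for (StreamWriteType t : values()) {
      if (t.cliName.equalsIgnoreCase(value)) {
        return t;
      }
    }
    throw new IllegalArgumentException("Unknown stream type: " + value);
  }

  @Override
  public String toString() {
    return cliName;
  }
}
```

With something like this, the `--type` parameter would no longer need `required = true`, and the chain of `equals` checks in `streamWrite` could become a `switch` over `StreamWriteType.parse(dataStreamType)`.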







[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739476132


   @szetszwo Not sure what happened, but DataStreamApi with type DirectByteBuffer seems slower than AsyncApi on HDD.
   





[GitHub] [incubator-ratis] runzhiwang commented on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739463349


   AsyncApi does not work yet; I am still working on it.





[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739476132


   @szetszwo Not sure what happened, but in my test DataStreamApi with type DirectByteBuffer seems slower than AsyncApi on HDD.
   DataStreamApi command:
   `${BIN}/client.sh filestore datastream --size 100000000 --numFiles 10 --bufferSize 4000000 --type DirectByteBuffer --peers ${PEERS}`
   AsyncApi command:
   `${BIN}/client.sh filestore loadgen --size 100000000 --numFiles 10 --bufferSize 4000000 --peers ${PEERS}`





[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739476132


   @szetszwo Not sure what happened. If the 3 servers are on different machines, DataStreamApi with type DirectByteBuffer seems slower than AsyncApi. If the 3 servers are on the same machine, DataStreamApi with type DirectByteBuffer performs better than AsyncApi.
   DataStreamApi command:
   `${BIN}/client.sh filestore datastream --size 100000000 --numFiles 10 --bufferSize 4000000 --type DirectByteBuffer --peers ${PEERS}`
   AsyncApi command:
   `${BIN}/client.sh filestore loadgen --size 100000000 --numFiles 10 --bufferSize 4000000 --peers ${PEERS}`





[GitHub] [incubator-ratis] runzhiwang commented on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739460991


   @szetszwo This PR can reproduce the bug in https://issues.apache.org/jira/browse/RATIS-1207.
   We just need to run the following command twice. The duplicated stream key error then occurs.
   `${BIN}/client.sh filestore datastream --size 100000000 --numFiles 10 --bufferSize 1000000 --type DirectByteBuffer --peers ${PEERS}`





[GitHub] [incubator-ratis] szetszwo commented on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739486881


   The problem may be that it keeps calling ByteBuffer.allocateDirect(..), which is slow and leaks memory.  We need a buffer pool.
   ```
       ByteBuffer byteBuffer = ByteBuffer.allocateDirect(bytesToRead);
       long offset = 0L;
   
       while (fileChannel.read(byteBuffer) > 0) {
         ...
         if (bytesToRead > 0) {
           byteBuffer = ByteBuffer.allocateDirect(bytesToRead);
         }
       }
   ```
   Could you post some benchmark results on RATIS-1176?
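
A minimal sketch of the buffer pool idea mentioned above, under stated assumptions: `DirectBufferPool` is a hypothetical class (it is not part of Ratis and not the fix that was eventually applied), all buffers have the same fixed capacity, and the caller hands each buffer back once it is no longer in use.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical pool that reuses direct buffers instead of reallocating them. */
public class DirectBufferPool {
  private final BlockingQueue<ByteBuffer> free;
  private final int bufferSize;

  /**
   * @param bufferSize capacity in bytes of every pooled buffer
   * @param maxPooled  maximum number of idle buffers kept for reuse
   */
  public DirectBufferPool(int bufferSize, int maxPooled) {
    this.bufferSize = bufferSize;
    this.free = new ArrayBlockingQueue<>(maxPooled);
  }

  /** Returns an idle buffer if one exists, otherwise allocates a new one. */
  public ByteBuffer acquire() {
    ByteBuffer buffer = free.poll();
    return buffer != null ? buffer : ByteBuffer.allocateDirect(bufferSize);
  }

  /** Hands a buffer back; it is cleared and kept unless the pool is full. */
  public void release(ByteBuffer buffer) {
    if (!buffer.isDirect() || buffer.capacity() != bufferSize) {
      throw new IllegalArgumentException("buffer does not belong to this pool");
    }
    buffer.clear();
    free.offer(buffer); // dropped (left to the GC) if maxPooled buffers are already idle
  }

  /** Number of idle buffers currently available for reuse. */
  public int available() {
    return free.size();
  }
}
```

In a read loop like the one quoted above, `ByteBuffer.allocateDirect(bytesToRead)` would become `pool.acquire()`. Because the stream writes are asynchronous, `release` should presumably be called only after the future for that write has completed, otherwise the buffer could be overwritten while it is still being sent.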





[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739463349


   AsyncApi does not work yet; I am still working on it.
   `${BIN}/client.sh filestore loadgen --size 100000000 --numFiles 10 --bufferSize 4096 --peers ${PEERS}`





[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739460991


   @szetszwo This PR can reproduce the bug in https://issues.apache.org/jira/browse/RATIS-1207.
   We just need to run the following command twice. The duplicated stream key error then occurs.
   `${BIN}/client.sh filestore datastream --size 100000000 --numFiles 10 --bufferSize 1000000 --type DirectByteBuffer --peers ${PEERS}`





[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739460991


   @szetszwo This PR can reproduce the bug in https://issues.apache.org/jira/browse/RATIS-1207.
   We just need to run the following command twice. The first run succeeds; the second fails because of the duplicated stream key.
   `${BIN}/client.sh filestore datastream --size 100000000 --numFiles 10 --bufferSize 1000000 --type DirectByteBuffer --peers ${PEERS}`
   
   Tip: we can comment out the console reporter to make the output clearer.
   ```
     @Override
     public void enableConsoleReporter(TimeDuration consoleReportRate) {
   
     //  addReporterRegistration(
     //      MetricsReporting.consoleReporter(consoleReportRate),
     //      MetricsReporting.stopConsoleReporter());
   
     }
   ```





[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739476132


   @szetszwo Not sure what happened. If the 3 servers are on different machines, DataStreamApi with type DirectByteBuffer seems slower than AsyncApi. If the 3 servers are on the same machine, DataStreamApi with type DirectByteBuffer performs better than AsyncApi.
   DataStreamApi command:
   `${BIN}/client.sh filestore datastream --size 100000000 --numFiles 10 --bufferSize 4000000 --type DirectByteBuffer --peers ${PEERS}`
   AsyncApi command:
   `${BIN}/client.sh filestore loadgen --size 100000000 --numFiles 10 --bufferSize 4000000 --peers ${PEERS}`





[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739460991


   @szetszwo This PR can reproduce the bug in https://issues.apache.org/jira/browse/RATIS-1207.
   We just need to run the following command twice. The first run succeeds; the second fails because of the duplicated stream key.
   `${BIN}/client.sh filestore datastream --size 100000000 --numFiles 10 --bufferSize 1000000 --type DirectByteBuffer --peers ${PEERS}`
   
   Tip: we can comment out the console reporter to make the output clearer.
   ```
     @Override
     public void enableConsoleReporter(TimeDuration consoleReportRate) {
       addReporterRegistration(
           MetricsReporting.consoleReporter(consoleReportRate),
           MetricsReporting.stopConsoleReporter());
     }
   ```





[GitHub] [incubator-ratis] runzhiwang commented on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739471674


   @szetszwo Could you help review this?





[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739476132


   @szetszwo Not sure what happened, but in my test DataStreamApi with type DirectByteBuffer seems slower than AsyncApi on HDD. On SSD, DataStreamApi with type DirectByteBuffer performs better than AsyncApi.
   DataStreamApi command:
   `${BIN}/client.sh filestore datastream --size 100000000 --numFiles 10 --bufferSize 4000000 --type DirectByteBuffer --peers ${PEERS}`
   AsyncApi command:
   `${BIN}/client.sh filestore loadgen --size 100000000 --numFiles 10 --bufferSize 4000000 --peers ${PEERS}`





[GitHub] [incubator-ratis] runzhiwang removed a comment on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang removed a comment on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739463349


   AsyncApi does not work yet; I am still working on it.
   `${BIN}/client.sh filestore loadgen --size 100000000 --numFiles 10 --bufferSize 4096 --peers ${PEERS}`





[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #325: RATIS-1186. Change the FileStore CLI to use Streaming

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #325:
URL: https://github.com/apache/incubator-ratis/pull/325#issuecomment-739476132


   @szetszwo Not sure what happened, but in my test DataStreamApi with type DirectByteBuffer seems slower than AsyncApi on HDD.
   

