Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/05/17 16:41:16 UTC

[GitHub] [ozone] elek opened a new pull request #2256: HDDS-5142. Make generic streaming client/service for container re-replication, data read, scm/om snapshot download

elek opened a new pull request #2256:
URL: https://github.com/apache/ozone/pull/2256


   ## What changes were proposed in this pull request?
   
   This was discussed offline a few times. (cc @bshashikant @mukul1987)
   
   Today we have server/client utilities and APIs for Hadoop RPC and gRPC based services, but we don't have a unified, tested API for data streaming.
   
   It is suggested to create a generic API for streaming.
   
   As I need to modify the closed container replication, I have already created a POC based on Netty (instead of gRPC).
   
   This patch shows the first proposed version.
   
    1. It uses a file-based API (the files to stream must be specified for a given id), because this lets us use Netty/native streaming without buffering in Java memory.
    2. Offsets could be supported, but are not implemented yet.
    3. A Freon test is included, which shows about 8x better performance compared to the old gRPC based container replication (220 Mb/sec vs 1.6-1.7 Gb/sec, using one thread + tmpfs).
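
The "without buffering in Java memory" idea from point 1 can be illustrated with the JDK alone. The sketch below is not code from this patch: it is a minimal example, assuming plain `java.nio` channels instead of Netty, of sending a file with `FileChannel.transferTo`, which hands the copy to the operating system instead of reading the file into a Java byte array. The names `ZeroCopySketch` and `transfer` are made up for illustration.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopySketch {

  /**
   * Sends the whole file to the target channel without reading it into a
   * Java byte array. Returns the number of bytes transferred.
   */
  public static long transfer(Path file, WritableByteChannel target)
      throws IOException {
    try (FileChannel source =
             FileChannel.open(file, StandardOpenOption.READ)) {
      long size = source.size();
      long position = 0;
      // transferTo may move fewer bytes than requested, so loop until done.
      while (position < size) {
        position += source.transferTo(position, size - position, target);
      }
      return position;
    }
  }

  public static void main(String[] args) throws IOException {
    Path in = Files.createTempFile("stream-src", ".bin");
    Path out = Files.createTempFile("stream-dst", ".bin");
    Files.write(in, new byte[1024 * 1024]);
    try (FileChannel target =
             FileChannel.open(out, StandardOpenOption.WRITE)) {
      System.out.println("transferred bytes: " + transfer(in, target));
    }
    Files.delete(in);
    Files.delete(out);
  }
}
```

On the Netty side, the `DefaultFileRegion` class imported by the patch's server handler offers the same kind of file-to-channel transfer for Netty channels.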
    
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-5142
   
   ## How was this patch tested?
   
   ```
   ./ozone freon strmg -n 1000 --files 10
   ```
   
   It creates a directory with 10*100MB files (1G) and replicates them 1000 times. It is expected to finish in under 10 minutes.
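
As a rough sanity check of these numbers, the sketch below computes the sustained throughput needed to finish within 10 minutes. It assumes decimal units (1 GB = 1000 MB), takes the file count, file size and iteration count from the description above, and ignores the time needed to create the source directory. The class and method names are made up for illustration.

```java
public class FreonThroughputEstimate {

  /** Average throughput (MB/s) needed to move all data within the time limit. */
  public static double requiredMegabytesPerSecond(
      int files, int fileSizeMb, int iterations, int seconds) {
    long totalMb = (long) files * fileSizeMb * iterations;
    return (double) totalMb / seconds;
  }

  public static void main(String[] args) {
    // 10 files * 100 MB * 1000 iterations = 1,000,000 MB in 600 seconds.
    double rate = requiredMegabytesPerSecond(10, 100, 1000, 600);
    System.out.println(String.format("required rate: %.0f MB/s", rate));
  }
}
```

This comes to roughly 1667 MB/s, which is in line with the 1.6-1.7 G/sec figure quoted above if that figure is read as gigabytes per second.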


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] elek commented on a change in pull request #2256: HDDS-5142. Make generic streaming client/service for container re-replication, data read, scm/om snapshot download

Posted by GitBox <gi...@apache.org>.
elek commented on a change in pull request #2256:
URL: https://github.com/apache/ozone/pull/2256#discussion_r635102740



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/StreamingClient.java
##########
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.container.stream;
+
+import io.netty.bootstrap.Bootstrap;
+import io.netty.channel.*;
+import io.netty.channel.nio.NioEventLoopGroup;
+import io.netty.channel.socket.SocketChannel;
+import io.netty.channel.socket.nio.NioSocketChannel;
+import io.netty.handler.codec.string.StringEncoder;
+import io.netty.util.CharsetUtil;
+
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.hadoop.ozone.container.stream.DirstreamServerHandler.END_MARKER;
+
+public class StreamingClient implements AutoCloseable {
+
+  private final Bootstrap bootstrap;
+  private final DirstreamClientHandler dirstreamClientHandler;
+  private EventLoopGroup group;
+  private int port;
+  private String host;
+
+  public StreamingClient(
+      String host,
+      int port,
+      StreamingDestination streamingDestination
+  ) throws InterruptedException {
+    this.port = port;
+    this.host = host;
+
+    group = new NioEventLoopGroup(100);

Review comment:
       Not really. I just used a very high number to make sure it's not a limit during the test ;-)
   
   I think we should make both the server and client more configurable in follow-up patches.
   
   






[GitHub] [ozone] elek commented on pull request #2256: HDDS-5142. Make generic streaming client/service for container re-replication, data read, scm/om snapshot download

Posted by GitBox <gi...@apache.org>.
elek commented on pull request #2256:
URL: https://github.com/apache/ozone/pull/2256#issuecomment-843963110


   Thanks a lot for the review @bshashikant. Just pushed the fix commit. Can you PTAL?




[GitHub] [ozone] bshashikant commented on a change in pull request #2256: HDDS-5142. Make generic streaming client/service for container re-replication, data read, scm/om snapshot download

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #2256:
URL: https://github.com/apache/ozone/pull/2256#discussion_r637810239



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/DirectoryServerSource.java
##########
@@ -0,0 +1,52 @@
+package org.apache.hadoop.ozone.container.stream;
+
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Streaming files from single directory.
+ */
+public class DirectoryServerSource implements StreamingSource {
+
+  private Path root;
+
+  public DirectoryServerSource(Path root) {
+    this.root = root;
+  }
+
+  @Override

Review comment:
       Thanks Marton for the explanation. It seems this is already addressed.






[GitHub] [ozone] bshashikant commented on pull request #2256: HDDS-5142. Make generic streaming client/service for container re-replication, data read, scm/om snapshot download

Posted by GitBox <gi...@apache.org>.
bshashikant commented on pull request #2256:
URL: https://github.com/apache/ozone/pull/2256#issuecomment-846912676


   Thanks @elek for the contribution.




[GitHub] [ozone] elek commented on a change in pull request #2256: HDDS-5142. Make generic streaming client/service for container re-replication, data read, scm/om snapshot download

Posted by GitBox <gi...@apache.org>.
elek commented on a change in pull request #2256:
URL: https://github.com/apache/ozone/pull/2256#discussion_r635105872



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/StreamingClient.java
##########
@@ -0,0 +1,93 @@
+package org.apache.hadoop.ozone.container.stream;
+
+import io.netty.bootstrap.Bootstrap;
+import io.netty.channel.*;
+import io.netty.channel.nio.NioEventLoopGroup;
+import io.netty.channel.socket.SocketChannel;
+import io.netty.channel.socket.nio.NioSocketChannel;
+import io.netty.handler.codec.string.StringEncoder;
+import io.netty.util.CharsetUtil;
+
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.hadoop.ozone.container.stream.DirstreamServerHandler.END_MARKER;
+
+public class StreamingClient implements AutoCloseable {
+
+  private final Bootstrap bootstrap;
+  private final DirstreamClientHandler dirstreamClientHandler;
+  private EventLoopGroup group;
+  private int port;
+  private String host;
+
+  public StreamingClient(
+      String host,
+      int port,
+      StreamingDestination streamingDestination
+  ) throws InterruptedException {
+    this.port = port;
+    this.host = host;
+
+    group = new NioEventLoopGroup(100);
+    dirstreamClientHandler = new DirstreamClientHandler(streamingDestination);
+    bootstrap = new Bootstrap();
+    bootstrap.group(group)
+        .channel(NioSocketChannel.class)
+        .option(ChannelOption.SO_RCVBUF, 1024 * 1024)
+        .handler(new ChannelInitializer<SocketChannel>() {
+          @Override
+          public void initChannel(SocketChannel ch) throws Exception {
+            ChannelPipeline p = ch.pipeline();
+            p.addLast(new StringEncoder(CharsetUtil.UTF_8),
+                dirstreamClientHandler
+            );
+          }
+        });
+
+  }
+
+
+  public void stream(String id) {
+    stream(id, 200L, TimeUnit.SECONDS);
+  }
+
+  public void stream(String id, long timeout, TimeUnit unit) {
+    try {
+      Channel channel = bootstrap.connect(host, port).sync().channel();
+      channel.writeAndFlush(id + "\n")
+          .await(timeout, unit);
+      channel.closeFuture().await(timeout, unit);
+      if (!dirstreamClientHandler.getCurrentFileName().equals(END_MARKER)) {
+        throw new RuntimeException("Streaming is failed. Not all files " +
+            "are streamed. Please check the log of the server." +
+            " Last (partial?) streamed file: "
+            + dirstreamClientHandler.getCurrentFileName());
+      }
+    } catch (InterruptedException e) {
+      throw new RuntimeException(e);
+    }
+  }
+
+
+  @Override
+  public void close() {
+    group.shutdownGracefully();

Review comment:
       Good point. I was fully convinced, but when I checked the javadoc to fix it, I found this:
   
   ![image](https://user-images.githubusercontent.com/170549/118796880-4e1f1180-b89c-11eb-8213-c1397242bfc1.png)
   
   I think Netty developers can identify the "sensible" defaults better than me (I checked, and it's a 2 sec quiet period + 15 sec timeout), so I would leave it as is ;-)
   
    Later we can also make it configurable if required... 






[GitHub] [ozone] bshashikant commented on a change in pull request #2256: HDDS-5142. Make generic streaming client/service for container re-replication, data read, scm/om snapshot download

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #2256:
URL: https://github.com/apache/ozone/pull/2256#discussion_r637811433



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/StreamingClient.java
##########
@@ -0,0 +1,93 @@
+package org.apache.hadoop.ozone.container.stream;
+
+import io.netty.bootstrap.Bootstrap;
+import io.netty.channel.*;
+import io.netty.channel.nio.NioEventLoopGroup;
+import io.netty.channel.socket.SocketChannel;
+import io.netty.channel.socket.nio.NioSocketChannel;
+import io.netty.handler.codec.string.StringEncoder;
+import io.netty.util.CharsetUtil;
+
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.hadoop.ozone.container.stream.DirstreamServerHandler.END_MARKER;
+
+public class StreamingClient implements AutoCloseable {
+
+  private final Bootstrap bootstrap;
+  private final DirstreamClientHandler dirstreamClientHandler;
+  private EventLoopGroup group;
+  private int port;
+  private String host;
+
+  public StreamingClient(
+      String host,
+      int port,
+      StreamingDestination streamingDestination
+  ) throws InterruptedException {
+    this.port = port;
+    this.host = host;
+
+    group = new NioEventLoopGroup(100);

Review comment:
       Thanks Marton. Let's file Jiras to make these things configurable.






[GitHub] [ozone] bshashikant commented on a change in pull request #2256: HDDS-5142. Make generic streaming client/service for container re-replication, data read, scm/om snapshot download

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #2256:
URL: https://github.com/apache/ozone/pull/2256#discussion_r634912997



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/DirectoryServerSource.java
##########
@@ -0,0 +1,52 @@
+package org.apache.hadoop.ozone.container.stream;
+
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Streaming files from single directory.
+ */
+public class DirectoryServerSource implements StreamingSource {
+
+  private Path root;
+
+  public DirectoryServerSource(Path root) {
+    this.root = root;
+  }
+
+  @Override

Review comment:
       Can we add a comment on what this "id" field signifies?

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/DirstreamServerHandler.java
##########
@@ -0,0 +1,136 @@
+package org.apache.hadoop.ozone.container.stream;
+
+import java.io.IOException;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map.Entry;
+
+import io.netty.buffer.ByteBuf;
+import io.netty.buffer.Unpooled;
+import io.netty.channel.ChannelFuture;
+import io.netty.channel.ChannelFutureListener;
+import io.netty.channel.ChannelHandlerContext;
+import io.netty.channel.ChannelInboundHandlerAdapter;
+import io.netty.channel.DefaultFileRegion;
+import io.netty.util.ByteProcessor;
+
+/**
+ * Protocol definition of the streaming.
+ */
+public class DirstreamServerHandler extends ChannelInboundHandlerAdapter {
+
+  public static final String END_MARKER = "0 END";
+
+  public static final ByteBuf END_MARKER_BUF =
+      Unpooled.wrappedBuffer(END_MARKER.getBytes(StandardCharsets.UTF_8));
+
+  private final StringBuilder id = new StringBuilder();
+
+  private StreamingSource source;
+
+  private boolean headerProcessed = false;
+
+  public DirstreamServerHandler(StreamingSource source) {
+    this.source = source;
+  }
+
+  @Override
+  public void channelRead(ChannelHandlerContext ctx, Object msg)
+      throws Exception {
+    if (!headerProcessed) {
+      ByteBuf buffer = (ByteBuf) msg;
+      int eolPosition = buffer.forEachByte(ByteProcessor.FIND_LF) - buffer
+          .readerIndex();
+      if (eolPosition > 0) {
+        headerProcessed = true;
+        id.append(buffer.toString(Charset.defaultCharset()));
+      } else {
+        id.append(buffer.toString(0, eolPosition, Charset.defaultCharset()));
+      }
+      buffer.release();
+    }
+
+    if (headerProcessed) {
+      ChannelFuture lastFuture = null;

Review comment:
       Is the lastFuture field required?

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/StreamingClient.java
##########
@@ -0,0 +1,93 @@
+package org.apache.hadoop.ozone.container.stream;
+
+import io.netty.bootstrap.Bootstrap;
+import io.netty.channel.*;

Review comment:
       Remove * imports.

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/StreamingClient.java
##########
@@ -0,0 +1,93 @@
+package org.apache.hadoop.ozone.container.stream;
+
+import io.netty.bootstrap.Bootstrap;
+import io.netty.channel.*;
+import io.netty.channel.nio.NioEventLoopGroup;
+import io.netty.channel.socket.SocketChannel;
+import io.netty.channel.socket.nio.NioSocketChannel;
+import io.netty.handler.codec.string.StringEncoder;
+import io.netty.util.CharsetUtil;
+
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.hadoop.ozone.container.stream.DirstreamServerHandler.END_MARKER;
+
+public class StreamingClient implements AutoCloseable {
+
+  private final Bootstrap bootstrap;
+  private final DirstreamClientHandler dirstreamClientHandler;
+  private EventLoopGroup group;
+  private int port;
+  private String host;
+
+  public StreamingClient(
+      String host,
+      int port,
+      StreamingDestination streamingDestination
+  ) throws InterruptedException {
+    this.port = port;
+    this.host = host;
+
+    group = new NioEventLoopGroup(100);
+    dirstreamClientHandler = new DirstreamClientHandler(streamingDestination);
+    bootstrap = new Bootstrap();
+    bootstrap.group(group)
+        .channel(NioSocketChannel.class)
+        .option(ChannelOption.SO_RCVBUF, 1024 * 1024)

Review comment:
       Should we use the SO_KEEPALIVE option as well?
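
For context, the option in question is TCP keep-alive, which Netty exposes as `ChannelOption.SO_KEEPALIVE`. The sketch below is not from the patch: it is a minimal JDK-only example, assuming a plain `SocketChannel` rather than a Netty bootstrap, that enables the underlying socket option and reads it back. The names `KeepAliveSketch` and `enableKeepAlive` are made up for illustration.

```java
import java.io.IOException;
import java.net.StandardSocketOptions;
import java.nio.channels.SocketChannel;

public class KeepAliveSketch {

  /** Enables TCP keep-alive on the channel and returns the resulting setting. */
  public static boolean enableKeepAlive(SocketChannel channel)
      throws IOException {
    channel.setOption(StandardSocketOptions.SO_KEEPALIVE, true);
    return channel.getOption(StandardSocketOptions.SO_KEEPALIVE);
  }

  public static void main(String[] args) throws IOException {
    // The option can be set before the channel is connected.
    try (SocketChannel channel = SocketChannel.open()) {
      System.out.println("keep-alive enabled: " + enableKeepAlive(channel));
    }
  }
}
```

In the client under review, the equivalent would presumably be an extra `.option(ChannelOption.SO_KEEPALIVE, true)` call next to the existing `SO_RCVBUF` option.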

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/StreamingClient.java
##########
@@ -0,0 +1,93 @@
+package org.apache.hadoop.ozone.container.stream;
+
+import io.netty.bootstrap.Bootstrap;
+import io.netty.channel.*;
+import io.netty.channel.nio.NioEventLoopGroup;
+import io.netty.channel.socket.SocketChannel;
+import io.netty.channel.socket.nio.NioSocketChannel;
+import io.netty.handler.codec.string.StringEncoder;
+import io.netty.util.CharsetUtil;
+
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.hadoop.ozone.container.stream.DirstreamServerHandler.END_MARKER;
+
+public class StreamingClient implements AutoCloseable {
+
+  private final Bootstrap bootstrap;
+  private final DirstreamClientHandler dirstreamClientHandler;
+  private EventLoopGroup group;
+  private int port;
+  private String host;
+
+  public StreamingClient(
+      String host,
+      int port,
+      StreamingDestination streamingDestination
+  ) throws InterruptedException {
+    this.port = port;
+    this.host = host;
+
+    group = new NioEventLoopGroup(100);

Review comment:
       Just curious, why is the event loop group created with 100 threads? Is there any rationale behind it?

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/DirectoryServerSource.java
##########
@@ -0,0 +1,52 @@
+package org.apache.hadoop.ozone.container.stream;
+
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Streaming files from single directory.
+ */
+public class DirectoryServerSource implements StreamingSource {
+
+  private Path root;
+
+  public DirectoryServerSource(Path root) {
+    this.root = root;
+  }
+
+  @Override

Review comment:
       I think it would be good to add a "list of files" as a server source as well, in addition to the directory source.






[GitHub] [ozone] elek commented on a change in pull request #2256: HDDS-5142. Make generic streaming client/service for container re-replication, data read, scm/om snapshot download

Posted by GitBox <gi...@apache.org>.
elek commented on a change in pull request #2256:
URL: https://github.com/apache/ozone/pull/2256#discussion_r635101150



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/StreamingClient.java
##########
@@ -0,0 +1,93 @@
+package org.apache.hadoop.ozone.container.stream;
+
+import io.netty.bootstrap.Bootstrap;
+import io.netty.channel.*;

Review comment:
       Done.






[GitHub] [ozone] bshashikant commented on a change in pull request #2256: HDDS-5142. Make generic streaming client/service for container re-replication, data read, scm/om snapshot download

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #2256:
URL: https://github.com/apache/ozone/pull/2256#discussion_r634928728



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/StreamingClient.java
##########
@@ -0,0 +1,93 @@
+package org.apache.hadoop.ozone.container.stream;
+
+import io.netty.bootstrap.Bootstrap;
+import io.netty.channel.*;
+import io.netty.channel.nio.NioEventLoopGroup;
+import io.netty.channel.socket.SocketChannel;
+import io.netty.channel.socket.nio.NioSocketChannel;
+import io.netty.handler.codec.string.StringEncoder;
+import io.netty.util.CharsetUtil;
+
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.hadoop.ozone.container.stream.DirstreamServerHandler.END_MARKER;
+
+public class StreamingClient implements AutoCloseable {
+
+  private final Bootstrap bootstrap;
+  private final DirstreamClientHandler dirstreamClientHandler;
+  private EventLoopGroup group;
+  private int port;
+  private String host;
+
+  public StreamingClient(
+      String host,
+      int port,
+      StreamingDestination streamingDestination
+  ) throws InterruptedException {
+    this.port = port;
+    this.host = host;
+
+    group = new NioEventLoopGroup(100);
+    dirstreamClientHandler = new DirstreamClientHandler(streamingDestination);
+    bootstrap = new Bootstrap();
+    bootstrap.group(group)
+        .channel(NioSocketChannel.class)
+        .option(ChannelOption.SO_RCVBUF, 1024 * 1024)
+        .handler(new ChannelInitializer<SocketChannel>() {
+          @Override
+          public void initChannel(SocketChannel ch) throws Exception {
+            ChannelPipeline p = ch.pipeline();
+            p.addLast(new StringEncoder(CharsetUtil.UTF_8),
+                dirstreamClientHandler
+            );
+          }
+        });
+
+  }
+
+
+  public void stream(String id) {
+    stream(id, 200L, TimeUnit.SECONDS);
+  }
+
+  public void stream(String id, long timeout, TimeUnit unit) {
+    try {
+      Channel channel = bootstrap.connect(host, port).sync().channel();
+      channel.writeAndFlush(id + "\n")
+          .await(timeout, unit);
+      channel.closeFuture().await(timeout, unit);
+      if (!dirstreamClientHandler.getCurrentFileName().equals(END_MARKER)) {
+        throw new RuntimeException("Streaming is failed. Not all files " +
+            "are streamed. Please check the log of the server." +
+            " Last (partial?) streamed file: "
+            + dirstreamClientHandler.getCurrentFileName());
+      }
+    } catch (InterruptedException e) {
+      throw new RuntimeException(e);
+    }
+  }
+
+
+  @Override
+  public void close() {
+    group.shutdownGracefully();

Review comment:
       I guess this can block. It would be better to use a timeout here.
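
One way to read the suggestion above, as a minimal sketch: bound the wait in `close()` and force the shutdown if the deadline passes. The sketch uses a plain JDK `ExecutorService` as a stand-in so that it runs without Netty on the classpath; `BoundedCloseClient` and its fields are hypothetical names, not code from this patch. With Netty, the analogous call would be the `shutdownGracefully(quietPeriod, timeout, unit)` overload, optionally followed by a timed `await` on the returned future.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a client that owns a worker pool. */
class BoundedCloseClient implements AutoCloseable {

  private final ExecutorService workers = Executors.newFixedThreadPool(2);
  private final long timeout;
  private final TimeUnit unit;

  BoundedCloseClient(long timeout, TimeUnit unit) {
    this.timeout = timeout;
    this.unit = unit;
  }

  void submit(Runnable task) {
    workers.submit(task);
  }

  boolean isTerminated() {
    return workers.isTerminated();
  }

  @Override
  public void close() {
    // Stop accepting new work, then wait at most `timeout` for running tasks.
    workers.shutdown();
    try {
      if (!workers.awaitTermination(timeout, unit)) {
        // Deadline passed: interrupt whatever is still running.
        workers.shutdownNow();
      }
    } catch (InterruptedException e) {
      workers.shutdownNow();
      // Preserve the interrupt status for the caller.
      Thread.currentThread().interrupt();
    }
  }
}
```

The timeout would probably need to be configurable, since a sensible bound for a test is unlikely to match one for a stalled replication.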






[GitHub] [ozone] bshashikant merged pull request #2256: HDDS-5142. Make generic streaming client/service for container re-replication, data read, scm/om snapshot download

Posted by GitBox <gi...@apache.org>.
bshashikant merged pull request #2256:
URL: https://github.com/apache/ozone/pull/2256


   




[GitHub] [ozone] elek commented on a change in pull request #2256: HDDS-5142. Make generic streaming client/service for container re-replication, data read, scm/om snapshot download

Posted by GitBox <gi...@apache.org>.
elek commented on a change in pull request #2256:
URL: https://github.com/apache/ozone/pull/2256#discussion_r635103727



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/DirstreamServerHandler.java
##########
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.container.stream;
+
+import java.io.IOException;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map.Entry;
+
+import io.netty.buffer.ByteBuf;
+import io.netty.buffer.Unpooled;
+import io.netty.channel.ChannelFuture;
+import io.netty.channel.ChannelFutureListener;
+import io.netty.channel.ChannelHandlerContext;
+import io.netty.channel.ChannelInboundHandlerAdapter;
+import io.netty.channel.DefaultFileRegion;
+import io.netty.util.ByteProcessor;
+
+/**
+ * Protocol definition of the streaming.
+ */
+public class DirstreamServerHandler extends ChannelInboundHandlerAdapter {
+
+  public static final String END_MARKER = "0 END";
+
+  public static final ByteBuf END_MARKER_BUF =
+      Unpooled.wrappedBuffer(END_MARKER.getBytes(StandardCharsets.UTF_8));
+
+  private final StringBuilder id = new StringBuilder();
+
+  private StreamingSource source;
+
+  private boolean headerProcessed = false;
+
+  public DirstreamServerHandler(StreamingSource source) {
+    this.source = source;
+  }
+
+  @Override
+  public void channelRead(ChannelHandlerContext ctx, Object msg)
+      throws Exception {
+    if (!headerProcessed) {
+      ByteBuf buffer = (ByteBuf) msg;
+      int eolPosition = buffer.forEachByte(ByteProcessor.FIND_LF) - buffer
+          .readerIndex();
+      if (eolPosition > 0) {
+        headerProcessed = true;
+        id.append(buffer.toString(Charset.defaultCharset()));
+      } else {
+        id.append(buffer.toString(0, eolPosition, Charset.defaultCharset()));
+      }
+      buffer.release();
+    }
+
+    if (headerProcessed) {
+      ChannelFuture lastFuture = null;

Review comment:
       Not any more. Removed, thanks.






[GitHub] [ozone] elek commented on a change in pull request #2256: HDDS-5142. Make generic streaming client/service for container re-replication, data read, scm/om snapshot download

Posted by GitBox <gi...@apache.org>.
elek commented on a change in pull request #2256:
URL: https://github.com/apache/ozone/pull/2256#discussion_r635100322



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/DirectoryServerSource.java
##########
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.container.stream;
+
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Streaming files from single directory.
+ */
+public class DirectoryServerSource implements StreamingSource {
+
+  private Path root;
+
+  public DirectoryServerSource(Path root) {
+    this.root = root;
+  }
+
+  @Override

Review comment:
       > Can we add a comment on what this "id" field signifies?
   
   Sure, I added a javadoc comment:
   
   ```
     /**
      * Return logical names and real file paths to replicate.
      *
      * @param id name of the subdirectory to replicate relative to root.
      */
     public Map<String, Path> getFilesToStream(String id)
   ```
   
   In the parent interface's javadoc it's more generic:
   
   ```
     /**
      *
      * @param id: custom identifier
      *
      * @return map of files which should be copied (logical name -> real path)
      */
     Map<String, Path> getFilesToStream(String id) throws InterruptedException;
   ```
   
   > I think it would be good to add a "list of files" as a directory server source as well?
   
   Can you please explain this in more detail? I don't fully understand.
   
   The main interface here is the `StreamingSource` which can return a list of files + logical names for any logical identifier.
   
   This is a very simple implementation where the logical identifier is the subdir and all the files are replicated.
   
   For closed-container replication we need a source where the id is the container id and we will have logical names like `container.yaml`,...
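
To make the contract concrete, here is a minimal sketch of a subdirectory-based source. The interface is redeclared locally under a hypothetical name (`StreamingSourceSketch`) so that the example compiles on its own; it mirrors the `getFilesToStream` signature quoted above. The path-traversal check and the use of relative paths as logical names are assumptions of this sketch, not necessarily what the patch does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

/** Local copy of the interface shape, for illustration only. */
interface StreamingSourceSketch {
  Map<String, Path> getFilesToStream(String id) throws InterruptedException;
}

/** Treats the id as a subdirectory of root and returns every regular file. */
class SubdirSourceSketch implements StreamingSourceSketch {

  private final Path root;

  SubdirSourceSketch(Path root) {
    this.root = root.normalize();
  }

  @Override
  public Map<String, Path> getFilesToStream(String id) {
    Path dir = root.resolve(id).normalize();
    // Assumption: ids such as "../other" must not escape the root.
    if (!dir.startsWith(root)) {
      throw new IllegalArgumentException("id escapes root: " + id);
    }
    Map<String, Path> files = new TreeMap<>();
    try (Stream<Path> walk = Files.walk(dir)) {
      // Logical name = path relative to the subdirectory
      // (uses the platform's separator for nested files).
      walk.filter(Files::isRegularFile)
          .forEach(p -> files.put(dir.relativize(p).toString(), p));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return files;
  }
}
```

A container-based source would keep the same method but resolve the id to a container's files and choose its own logical names.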






[GitHub] [ozone] elek commented on a change in pull request #2256: HDDS-5142. Make generic streaming client/service for container re-replication, data read, scm/om snapshot download

Posted by GitBox <gi...@apache.org>.
elek commented on a change in pull request #2256:
URL: https://github.com/apache/ozone/pull/2256#discussion_r635103396



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/stream/StreamingClient.java
##########
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.container.stream;
+
+import io.netty.bootstrap.Bootstrap;
+import io.netty.channel.*;
+import io.netty.channel.nio.NioEventLoopGroup;
+import io.netty.channel.socket.SocketChannel;
+import io.netty.channel.socket.nio.NioSocketChannel;
+import io.netty.handler.codec.string.StringEncoder;
+import io.netty.util.CharsetUtil;
+
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.hadoop.ozone.container.stream.DirstreamServerHandler.END_MARKER;
+
+public class StreamingClient implements AutoCloseable {
+
+  private final Bootstrap bootstrap;
+  private final DirstreamClientHandler dirstreamClientHandler;
+  private EventLoopGroup group;
+  private int port;
+  private String host;
+
+  public StreamingClient(
+      String host,
+      int port,
+      StreamingDestination streamingDestination
+  ) throws InterruptedException {
+    this.port = port;
+    this.host = host;
+
+    group = new NioEventLoopGroup(100);
+    dirstreamClientHandler = new DirstreamClientHandler(streamingDestination);
+    bootstrap = new Bootstrap();
+    bootstrap.group(group)
+        .channel(NioSocketChannel.class)
+        .option(ChannelOption.SO_RCVBUF, 1024 * 1024)

Review comment:
       Just added, but we may need to make it configurable too. For example, if closed-container replication stalls, I would prefer to drop the connection instead of keeping it open.



