Posted to issues@nifi.apache.org by GitBox <gi...@apache.org> on 2022/05/13 08:16:46 UTC

[GitHub] [nifi-minifi-cpp] adamdebreceni opened a new pull request, #1332: MINIFICPP-1831 - Download assets through the c2 protocol

adamdebreceni opened a new pull request, #1332:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1332

   Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced
        in the commit message?
   
   - [ ] Does your PR title start with MINIFICPP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
   
   - [ ] Has your PR been rebased against the latest commit within the target branch (typically main)?
   
   - [ ] Is your initial contribution a single, squashed commit?
   
   ### For code changes:
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the LICENSE file?
   - [ ] If applicable, have you updated the NOTICE file?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check GitHub Actions CI results for build issues and submit an update to your PR as soon as possible.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a diff in pull request #1332: MINIFICPP-1831 - Download assets through the c2 protocol

Posted by GitBox <gi...@apache.org>.
adamdebreceni commented on code in PR #1332:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1332#discussion_r876735094


##########
extensions/http-curl/tests/C2UpdateAssetTest.cpp:
##########
@@ -0,0 +1,260 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#undef NDEBUG
+#include <vector>
+#include <string>
+#include <fstream>
+#include <iterator>
+
+#include "HTTPIntegrationBase.h"
+#include "HTTPHandlers.h"
+#include "utils/IntegrationTestUtils.h"
+#include "utils/Environment.h"
+
+class FileProvider : public ServerAwareHandler {
+ public:
+  explicit FileProvider(std::string file_content): file_content_(std::move(file_content)) {}
+
+  bool handleGet(CivetServer* /*server*/, struct mg_connection* conn) override {
+    mg_printf(conn, "HTTP/1.1 200 OK\r\nContent-Type: "
+                    "text/plain\r\nContent-Length: %zu\r\nConnection: close\r\n\r\n",
+              file_content_.length());
+    mg_printf(conn, "%s", file_content_.c_str());
+    return true;
+  }
+
+ private:
+  std::string file_content_;
+};
+
+class C2HeartbeatHandler : public HeartbeatHandler {
+ public:
+  using HeartbeatHandler::HeartbeatHandler;
+
+  bool handlePost(CivetServer* /*server*/, struct mg_connection* conn) override {
+    std::lock_guard<std::mutex> guard(op_mtx_);
+    sendHeartbeatResponse(operations_, conn);
+    operations_.clear();
+    return true;
+  }
+
+  void addOperation(std::string id, std::unordered_map<std::string, std::string> args) {
+    std::lock_guard<std::mutex> guard(op_mtx_);
+    operations_.push_back(C2Operation{
+      .operation = "update",
+      .operand = "asset",
+      .operation_id = std::move(id),
+      .args = std::move(args)
+    });
+  }
+
+ private:
+  std::mutex op_mtx_;
+  std::vector<C2Operation> operations_;
+};
+
+class VerifyC2AssetUpdate : public VerifyC2Base {
+ public:
+  void configureC2() override {
+    configuration->set("nifi.c2.agent.protocol.class", "RESTSender");
+    configuration->set("nifi.c2.enable", "true");
+    configuration->set("nifi.c2.agent.heartbeat.period", "100");

Review Comment:
   Could you expand on this? I tried decreasing the heartbeat period to 10ms (to trigger something) and did not notice any difference in the test's behavior. `C2Agent::consume` processes at most one response every `C2RESPONSE_POLL_MS`, and there does not seem to be any logic in `C2Agent::stop` to process the remaining items in `C2Agent::responses`. `C2Agent::produce` can send up to `max_c2_responses` (seems to be a constant `5`) responses (acks to operations) and performs one heartbeat, which adds one response to be consumed by `C2Agent::consume`.
   
   I see that the `restart_needed_` flag is only acted upon when `requests` is empty, so I think this latency is exclusive to the restart tests.
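
The gating described in this comment can be illustrated with a small model. This is only a sketch, not the actual `C2Agent` code: `ToyAgent`, `tick`, `enqueue` and `pollsUntilRestart` are hypothetical names, and the only behavior assumed is what the comment states, namely that at most one queued item is handled per poll and that a pending restart is acted upon only when the queue is empty.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Toy model (hypothetical, not the real C2Agent): one queued item is handled
// per poll, and a pending restart is only performed when the queue is empty.
class ToyAgent {
 public:
  void enqueue(int item) { queue_.push_back(item); }
  void requestRestart() { restart_needed_ = true; }

  // One consumer poll. Returns true if the restart was performed in this poll.
  bool tick() {
    if (!queue_.empty()) {
      queue_.pop_front();  // handle at most one item per poll
      return false;
    }
    if (restart_needed_) {
      restart_needed_ = false;
      return true;
    }
    return false;
  }

  std::size_t pending() const { return queue_.size(); }

 private:
  std::deque<int> queue_;
  bool restart_needed_ = false;
};

// Runs `polls` consumer polls while the producer adds `items_per_poll` items
// before each poll. Returns the index of the poll that restarted, or -1.
int pollsUntilRestart(int items_per_poll, int polls) {
  ToyAgent agent;
  agent.requestRestart();
  for (int i = 0; i < polls; ++i) {
    for (int j = 0; j < items_per_poll; ++j) agent.enqueue(j);
    if (agent.tick()) return i;
  }
  return -1;
}
```

In this model an idle queue restarts on the first poll (`pollsUntilRestart(0, 10)` returns `0`), while one new item per poll defers the restart for as long as the producer keeps up (`pollsUntilRestart(1, 1000)` returns `-1`).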





[GitHub] [nifi-minifi-cpp] szaszm closed pull request #1332: MINIFICPP-1831 - Download assets through the c2 protocol

Posted by GitBox <gi...@apache.org>.
szaszm closed pull request #1332: MINIFICPP-1831 - Download assets through the c2 protocol
URL: https://github.com/apache/nifi-minifi-cpp/pull/1332




[GitHub] [nifi-minifi-cpp] szaszm commented on a diff in pull request #1332: MINIFICPP-1831 - Download assets through the c2 protocol

Posted by GitBox <gi...@apache.org>.
szaszm commented on code in PR #1332:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1332#discussion_r877257128


##########
extensions/http-curl/tests/C2UpdateAssetTest.cpp:
##########
@@ -0,0 +1,260 @@
+class VerifyC2AssetUpdate : public VerifyC2Base {
+ public:
+  void configureC2() override {
+    configuration->set("nifi.c2.agent.protocol.class", "RESTSender");
+    configuration->set("nifi.c2.enable", "true");
+    configuration->set("nifi.c2.agent.heartbeat.period", "100");

Review Comment:
   I was thinking that if `produce` adds a new heartbeat every 100ms and `consume` consumes one every 100ms, then `responses.size()` might stay > 0 long term, and `consume` would gain N * 100ms of latency. At least that's what I thought, but I couldn't trigger any issues now, so I'm probably misremembering some detail. Feel free to ignore this. I don't think it would be an issue in this case anyway.
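
The arithmetic behind this concern can be sketched with a discrete-time model. Everything here is an assumption made for illustration: `backlogAfter` and `addedLatencyMs` are hypothetical helpers, the producer is taken to add exactly one response per period, the consumer to remove at most one per poll, and responses to be handled in FIFO order. Whether the real agent behaves this way depends on details not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: the producer adds one response every produce_period_ms,
// the consumer removes at most one every consume_period_ms.
// Returns the number of responses still queued after duration_ms.
std::size_t backlogAfter(int produce_period_ms, int consume_period_ms, int duration_ms) {
  std::size_t queued = 0;
  for (int t = 1; t <= duration_ms; ++t) {
    if (t % produce_period_ms == 0) ++queued;                 // producer step
    if (t % consume_period_ms == 0 && queued > 0) --queued;   // consumer step
  }
  return queued;
}

// Extra delay a newly queued response sees because of the `backlog` items
// ahead of it, assuming FIFO handling and one item per consumer poll.
int addedLatencyMs(std::size_t backlog, int consume_period_ms) {
  return static_cast<int>(backlog) * consume_period_ms;
}
```

With both periods at 100ms the model keeps up (`backlogAfter(100, 100, 1000)` returns `0`). With a 10ms producer and a 100ms consumer, `backlogAfter(10, 100, 1000)` returns `90` after one second, which corresponds to `addedLatencyMs(90, 100) == 9000`.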





[GitHub] [nifi-minifi-cpp] szaszm commented on a diff in pull request #1332: MINIFICPP-1831 - Download assets through the c2 protocol

Posted by GitBox <gi...@apache.org>.
szaszm commented on code in PR #1332:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1332#discussion_r876454396


##########
libminifi/src/c2/C2Agent.cpp:
##########
@@ -897,6 +901,113 @@ bool C2Agent::handleConfigurationUpdate(const C2ContentResponse &resp) {
   return true;
 }
 
+static auto make_path(const std::string& str) {
+  return std::filesystem::path(str);
+}
+
+static std::optional<std::string> validateFilePath(const std::filesystem::path& path) {
+  if (path.empty()) {
+    return "Empty file path";
+  }
+  if (!path.is_relative()) {
+    return "File path must be a relative path '" + path.string() + "'";
+  }
+  if (!path.has_filename()) {
+    return "Filename missing in output path '" + path.string() + "'";
+  }
+  if (path.filename() == "." || path.filename() == "..") {
+    return "Invalid filename '" + path.filename().string() + "'";
+  }
+  for (const auto& segment : path) {
+    if (segment == "..") {
+      return "Accessing parent directory is forbidden in file path '" + path.string() + "'";
+    }
+  }
+  return std::nullopt;
+}
+
+void C2Agent::handleAssetUpdate(const C2ContentResponse& resp) {
+  auto send_error = [&] (std::string_view error) {
+    logger_->log_error("%s", std::string(error));
+    C2Payload response(Operation::ACKNOWLEDGE, state::UpdateState::SET_ERROR, resp.ident, true);
+    response.setRawData(gsl::span<const char>(error).as_span<const std::byte>());
+    enqueue_c2_response(std::move(response));
+  };
+  std::filesystem::path asset_dir = std::filesystem::path(configuration_->getHome()) / "asset";
+  if (auto asset_dir_str = configuration_->get(Configuration::nifi_asset_directory)) {
+    asset_dir = asset_dir_str.value();
+  }
+
+  // output file
+  std::filesystem::path file_path;
+  if (auto file_rel = resp.getArgument("file") | utils::map(make_path)) {
+    if (auto error = validateFilePath(file_rel.value())) {
+      send_error(error.value());
+      return;
+    }
+    file_path = asset_dir / file_rel.value();
+  } else {
+    send_error("Couldn't find 'file' argument");
+    return;
+  }
+
+  // source url
+  std::string url;
+  if (auto url_str = resp.getArgument("url")) {
+    if (auto resolved_url = resolveUrl(*url_str)) {
+      url = resolved_url.value();
+    } else {
+      send_error("Couldn't resolve url");
+      return;
+    }
+  } else {
+    send_error("Couldn't find 'url' argument");
+    return;
+  }
+
+  // forceDownload
+  bool force_download = false;
+  if (auto force_download_str = resp.getArgument("forceDownload")) {
+    if (utils::StringUtils::equalsIgnoreCase(force_download_str.value(), "true")) {
+      force_download = true;
+    } else if (utils::StringUtils::equalsIgnoreCase(force_download_str.value(), "false")) {
+      force_download = false;
+    } else {
+      send_error("Argument 'forceDownload' must be either 'true' or 'false'");
+      return;
+    }
+  }
+
+  if (!force_download && std::filesystem::exists(file_path)) {
+    logger_->log_info("File already exists");
+    C2Payload response(Operation::ACKNOWLEDGE, state::UpdateState::NO_OPERATION, resp.ident, true);
+    enqueue_c2_response(std::move(response));
+    return;
+  }
+
+  C2Payload&& file_response = protocol_.load()->consumePayload(url, C2Payload(Operation::TRANSFER, true), RECEIVE, false);

Review Comment:
   I know this is a pattern in this part of the codebase, but can we avoid relying on reference lifetime extension in new code when we can just as easily make a proper object?
   ```suggestion
     C2Payload file_response = protocol_.load()->consumePayload(url, C2Payload(Operation::TRANSFER, true), RECEIVE, false);
   ```
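
For readers unfamiliar with the rule being discussed, here is a minimal sketch of the two forms. `Payload` and `makePayload` are hypothetical stand-ins for `C2Payload` and `consumePayload`; the move counter is added only so the cost of each form can be observed, and the result assumes C++17 guaranteed copy elision.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for C2Payload; counts moves so the cost is visible.
struct Payload {
  std::string data;
  int moves = 0;
  explicit Payload(std::string d) : data(std::move(d)) {}
  Payload(Payload&& other) noexcept
      : data(std::move(other.data)), moves(other.moves + 1) {}
};

// Hypothetical stand-in for consumePayload: returns a prvalue.
Payload makePayload() { return Payload{"asset bytes"}; }

// Binds the temporary to an rvalue reference. Its lifetime is extended to the
// end of this scope, but the reader has to know that rule to see it is safe.
int viaReference() {
  Payload&& p = makePayload();
  return p.moves;
}

// Declares a plain object. With guaranteed copy elision (C++17) the value is
// constructed in place, so no move happens here either.
int viaValue() {
  Payload p = makePayload();
  return p.moves;
}
```

Under C++17 both functions return `0`, so in this sketch the by-value form costs nothing extra. The difference is readability: the plain object does not depend on the lifetime-extension rule.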



##########
extensions/http-curl/tests/C2UpdateAssetTest.cpp:
##########
@@ -0,0 +1,260 @@
+class VerifyC2AssetUpdate : public VerifyC2Base {
+ public:
+  void configureC2() override {
+    configuration->set("nifi.c2.agent.protocol.class", "RESTSender");
+    configuration->set("nifi.c2.enable", "true");
+    configuration->set("nifi.c2.agent.heartbeat.period", "100");

Review Comment:
   Consider slightly increasing the heartbeat period. The C2Agent consumer polls once every 100ms (`C2RESPONSE_POLL_MS` in C2Agent.h), and the consumer may fail to keep up with the producer, which introduces some latency. I ran into this with the restart after a property update: the restart never happened because there were always tasks left in the consumer queue.
   It wouldn't break the test here, but I'd like to avoid this problem if possible.


