Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/06/22 18:41:46 UTC

[GitHub] [arrow] pitrou commented on a change in pull request #10574: ARROW-12790: [C++] Improve HadoopFileSystem conformance

pitrou commented on a change in pull request #10574:
URL: https://github.com/apache/arrow/pull/10574#discussion_r656488996



##########
File path: cpp/src/arrow/filesystem/hdfs.cc
##########
@@ -134,21 +140,39 @@ class HadoopFileSystem::Impl {
     }
     std::vector<FileInfo> results;
 
+    // Fetch working directory.
+    // If select.base_dir is relative, we need to trim it from the start
+    // of paths returned by ListDirectory.
+    // If select.base_dir is absolute, we need to trim the "URI authority"
+    // portion of the working directory.
     std::string wd;
-    if (select.base_dir.empty() || select.base_dir.front() != '/') {
-      // Fetch working directory, because we need to trim it from the start
-      // of paths returned by ListDirectory as select.base_dir is relative.
-      RETURN_NOT_OK(client_->GetWorkingDirectory(&wd));
-      Uri wd_uri;
-      RETURN_NOT_OK(wd_uri.Parse(wd));
-      wd = wd_uri.path();
+    RETURN_NOT_OK(client_->GetWorkingDirectory(&wd));
+
+    if (!select.base_dir.empty() && select.base_dir.front() == '/') {
+      // base_dir is absolute, only keep the URI authority portion.
+      // As mentioned in StatSelector() above, the URI may contain unescaped
+      // special chars and therefore may not be a valid URI, so we parse by hand.
+      auto pos = wd.find("://");  // start of host:port portion
+      if (pos == std::string::npos) {
+        return Status::Invalid("xxx");

Review comment:
   Well, of course, they're just terse :-D
   I'll fix them tomorrow.
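
   For illustration only, here is a minimal standalone sketch of the hand-rolled parse described in the hunk above, with a more descriptive error message than "xxx". It is not the code from the PR: the helper names `ExtractUriAuthorityPrefix` and `DescribeMissingAuthority`, the use of `std::optional` instead of `arrow::Status`, and the message wording are all assumptions made for the example.

```cpp
#include <optional>
#include <string>

// Hypothetical helper, not part of Arrow: returns the "scheme://host:port"
// prefix of a working directory such as "hdfs://localhost:8020/user/me",
// or std::nullopt if the string contains no "://" separator.
// The string is scanned by hand because it may contain unescaped special
// characters and so may not parse as a valid URI.
std::optional<std::string> ExtractUriAuthorityPrefix(const std::string& wd) {
  const auto pos = wd.find("://");  // start of host:port portion
  if (pos == std::string::npos) {
    return std::nullopt;
  }
  // The first '/' after "://" marks the start of the path component.
  const auto path_start = wd.find('/', pos + 3);
  if (path_start == std::string::npos) {
    return wd;  // no path component, the whole string is the prefix
  }
  return wd.substr(0, path_start);
}

// Hypothetical error text for the failure case (wording is an assumption).
std::string DescribeMissingAuthority(const std::string& wd) {
  return "Expected working directory to be a URI containing '://', got '" +
         wd + "'";
}
```

   In the real code the failure branch would presumably return `Status::Invalid(...)` with a message along these lines rather than a `std::nullopt`.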



