You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/17 12:03:34 UTC

[GitHub] [arrow] lidavidm commented on a diff in pull request #13390: ARROW-16424: [C++] Update uri_path parsing in FromProto

lidavidm commented on code in PR #13390:
URL: https://github.com/apache/arrow/pull/13390#discussion_r900050308


##########
cpp/src/arrow/util/uri.h:
##########
@@ -68,6 +68,8 @@ class ARROW_EXPORT Uri {
   /// The URI path component.
   std::string path() const;
 
+  std::string extension() const;

Review Comment:
   nit: docstring?



##########
cpp/src/arrow/engine/substrait/relation_internal.h:
##########
@@ -24,6 +24,7 @@
 #include "arrow/engine/substrait/serde.h"
 #include "arrow/engine/substrait/visibility.h"
 #include "arrow/type_fwd.h"
+#include "arrow/util/uri.h"

Review Comment:
   nit: if this include isn't actually needed in the header, can it be placed in the .cc file instead?



##########
cpp/src/arrow/util/uri.cc:
##########
@@ -208,6 +208,8 @@ std::string Uri::path() const {
   return std::move(ss).str();
 }
 
+std::string Uri::extension() const { return impl_->path_segments_.back().to_string(); }

Review Comment:
   Also, is the last path segment really the extension? I would've expected it to be the filename…



##########
cpp/src/arrow/util/uri.cc:
##########
@@ -208,6 +208,8 @@ std::string Uri::path() const {
   return std::move(ss).str();
 }
 
+std::string Uri::extension() const { return impl_->path_segments_.back().to_string(); }

Review Comment:
   `path_segments_` could be empty, looking at the implementation.



##########
cpp/src/arrow/engine/substrait/relation_internal.cc:
##########
@@ -106,25 +106,37 @@ Result<compute::Declaration> FromProto(const substrait::Rel& rel,
           path = item.uri_path_glob();
         }
 
+        ::arrow::internal::Uri uri;
+        RETURN_NOT_OK(uri.Parse(path));
         if (item.format() ==
             substrait::ReadRel::LocalFiles::FileOrFiles::FILE_FORMAT_PARQUET) {
           format = std::make_shared<dataset::ParquetFileFormat>();
-        } else if (util::string_view{path}.ends_with(".arrow")) {
+        } else if (uri.extension() == ".arrow") {

Review Comment:
   we may also want to support ".arrows"?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org