You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/06 08:47:37 UTC

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10104: ARROW-12472: [Python] Properly convert paths to strings

jorisvandenbossche commented on a change in pull request #10104:
URL: https://github.com/apache/arrow/pull/10104#discussion_r627206249



##########
File path: python/pyarrow/tests/parquet/test_dataset.py
##########
@@ -1586,3 +1586,25 @@ def test_parquet_dataset_partitions_piece_path_with_fsspec(tempdir):
     # ensure the piece path is also posix-style
     expected = path + "/data.parquet"
     assert dataset.pieces[0].path == expected
+
+
+def test_read_table_with_fspath(tempdir):

Review comment:
       I would maybe move this test to `tests/parquet/test_basic.py`, where there is a similar `test_multiple_path_types`, which also tests the handling of path-like objects. So you can put it next to that test.

##########
File path: python/pyarrow/parquet.py
##########
@@ -1491,6 +1503,16 @@ def __init__(self, path_or_paths, filesystem=None, filters=None,
                     "Keyword '{0}' is not yet supported with the new "
                     "Dataset API".format(keyword))
 
+        if (
+            hasattr(path_or_paths, "__fspath__") and
+            filesystem is not None and
+            not _is_local_file_system(filesystem)
+        ):
+            raise TypeError(
+                "Path-like objects with __fspath__ must only be used with "
+                f"local file systems, not {type(filesystem)}"

Review comment:
       If you move this check a bit below, after the validation of filesystem keyword (which calls `_ensure_filesystem`, and this will map an fsspec LocalFileSystem to our own LocalFileSystem), then you don't need to check for fsspec, but only for our own LocalFileSystems.

##########
File path: python/pyarrow/parquet.py
##########
@@ -1471,6 +1471,18 @@ def _make_manifest(path_or_paths, fs, pathsep='/', metadata_nthreads=1,
     return pieces, partitions, common_metadata_path, metadata_path
 
 
+def _is_local_file_system(fs):
+    import fsspec

Review comment:
       The reason for the failing tests is this import, since `fsspec` is not a required depedendency, we can't assume we can import it here as such.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org