Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/17 16:21:58 UTC
[GitHub] [arrow] jorisvandenbossche commented on pull request #10118: ARROW-12468: [Python][R] Expose ScannerBuilder::UseAsync to Python & R
jorisvandenbossche commented on pull request #10118:
URL: https://github.com/apache/arrow/pull/10118#issuecomment-842458877
One concern I had on seeing the heavy parameterization (many tests now run 4 times) was the test run time. Checking the most expensive tests, though, almost all of the slowest entries are S3 tests:
```
$ pytest python/pyarrow/tests/test_dataset.py --durations=20
...
============================================================================================== slowest 20 durations ===============================================================================================
15.39s call pyarrow/tests/test_dataset.py::test_open_dataset_from_s3_with_filesystem_uri[threaded-sync]
15.36s call pyarrow/tests/test_dataset.py::test_open_dataset_from_s3_with_filesystem_uri[serial-sync]
15.36s call pyarrow/tests/test_dataset.py::test_open_dataset_from_s3_with_filesystem_uri[serial-async]
15.35s call pyarrow/tests/test_dataset.py::test_open_dataset_from_s3_with_filesystem_uri[threaded-async]
3.13s call pyarrow/tests/test_dataset.py::test_write_dataset_s3
2.03s setup pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3[threaded-async]
2.02s setup pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3_fsspec[threaded-async]
2.02s setup pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3[serial-async]
2.02s setup pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3_fsspec[serial-async]
2.02s setup pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3[threaded-sync]
2.02s setup pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3[serial-sync]
2.02s setup pyarrow/tests/test_dataset.py::test_write_dataset_s3
2.01s setup pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3_fsspec[serial-sync]
1.58s setup pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3_fsspec[threaded-sync]
1.07s call pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3[threaded-async]
1.06s call pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3[threaded-sync]
1.05s call pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3[serial-sync]
1.05s call pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3[serial-async]
0.47s call pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3_fsspec[threaded-async]
0.30s setup pyarrow/tests/test_dataset.py::test_make_fragment
```
So we now run those 4 times. Rather than objecting to the parameterization (I suppose it's especially useful for the S3 tests?), it's probably more useful to find out why they take such a long time.
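
To get a rough feel for how much the x4 costs per test, the durations can be summed per test function across its parameter ids. This is only a minimal sketch: `total_by_test` and the regular expression are hypothetical helpers written for illustration, not part of pytest or pyarrow, and the input is a handful of lines copied from the output above.

```python
import re
from collections import defaultdict

# A few lines copied verbatim from the --durations output above.
DURATIONS = """\
15.39s call pyarrow/tests/test_dataset.py::test_open_dataset_from_s3_with_filesystem_uri[threaded-sync]
15.36s call pyarrow/tests/test_dataset.py::test_open_dataset_from_s3_with_filesystem_uri[serial-sync]
15.36s call pyarrow/tests/test_dataset.py::test_open_dataset_from_s3_with_filesystem_uri[serial-async]
15.35s call pyarrow/tests/test_dataset.py::test_open_dataset_from_s3_with_filesystem_uri[threaded-async]
3.13s call pyarrow/tests/test_dataset.py::test_write_dataset_s3
0.30s setup pyarrow/tests/test_dataset.py::test_make_fragment
"""

# Hypothetical pattern for one durations line:
# "<seconds>s <phase> <path>::<test name>[<optional parameter id>]"
LINE_RE = re.compile(
    r"^(?P<secs>\d+\.\d+)s\s+(?P<phase>\w+)\s+\S+::(?P<name>\w+)"
    r"(?:\[(?P<params>[^\]]+)\])?$"
)


def total_by_test(report):
    """Sum the reported seconds per test function, ignoring the parameter id."""
    totals = defaultdict(float)
    for line in report.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            totals[match["name"]] += float(match["secs"])
    return dict(totals)


totals = total_by_test(DURATIONS)
# Print the tests from most to least expensive.
for name, secs in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{secs:6.2f}s  {name}")
```

On the lines used here, the four variants of `test_open_dataset_from_s3_with_filesystem_uri` together account for about a minute, which is why that test looks like the first one to investigate.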