Posted to commits@arrow.apache.org by jo...@apache.org on 2022/05/05 12:17:26 UTC

[arrow] branch master updated: ARROW-16458: [CI][Python] Run dask S3 tests on nightly integration

This is an automated email from the ASF dual-hosted git repository.

jorisvandenbossche pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new 860ce72cd6 ARROW-16458: [CI][Python] Run dask S3 tests on nightly integration
860ce72cd6 is described below

commit 860ce72cd68a433574fc1ba98fb799e8ce751c2f
Author: Raúl Cumplido <ra...@gmail.com>
AuthorDate: Thu May 5 14:17:15 2022 +0200

    ARROW-16458: [CI][Python] Run dask S3 tests on nightly integration
    
    This PR adds coverage for running the dask parquet tests that use the S3 filesystem.
    
    Closes #13071 from raulcd/ARROW-16458
    
    Authored-by: Raúl Cumplido <ra...@gmail.com>
    Signed-off-by: Joris Van den Bossche <jo...@gmail.com>
---
 ci/scripts/install_dask.sh     | 3 +++
 ci/scripts/integration_dask.sh | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/ci/scripts/install_dask.sh b/ci/scripts/install_dask.sh
index 25904397a9..eb9c4e3dd4 100755
--- a/ci/scripts/install_dask.sh
+++ b/ci/scripts/install_dask.sh
@@ -33,3 +33,6 @@ elif [ "${dask}" = "latest" ]; then
 else
   pip install dask[dataframe]==${dask}
 fi
+
+# additional dependencies needed for dask's s3 tests
+pip install moto[server] flask requests
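
The moto[server] extra installed above provides a standalone HTTP server
that emulates the S3 API, which is what lets dask's S3 tests run without
real AWS credentials. A minimal sketch of that mechanism, outside this
commit (ThreadedMotoServer, the port, and the dummy credentials are
assumptions about the moto and boto3 APIs, not taken from the patch):

    import boto3
    from moto.server import ThreadedMotoServer

    # Start a local HTTP server that speaks the S3 wire protocol.
    server = ThreadedMotoServer(port=5000)
    server.start()
    try:
        # The mock endpoint accepts any credentials.
        s3 = boto3.client(
            "s3",
            endpoint_url="http://127.0.0.1:5000",
            aws_access_key_id="testing",
            aws_secret_access_key="testing",
            region_name="us-east-1",
        )
        s3.create_bucket(Bucket="test-bucket")
        print(s3.list_buckets()["Buckets"])
    finally:
        server.stop()
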
diff --git a/ci/scripts/integration_dask.sh b/ci/scripts/integration_dask.sh
index e755839718..313040014a 100755
--- a/ci/scripts/integration_dask.sh
+++ b/ci/scripts/integration_dask.sh
@@ -39,3 +39,6 @@ pytest -v --pyargs dask.dataframe.io.tests.test_orc
 # test_pandas_timestamp_overflow_pyarrow is skipped because of ARROW-15720 - can be removed once 2022.02.1 is out
 pytest -v --pyargs dask.dataframe.io.tests.test_parquet \
   -k "not test_to_parquet_pyarrow_w_inconsistent_schema_by_partition_fails_by_default and not test_timeseries_nulls_in_schema and not test_pandas_timestamp_overflow_pyarrow"
+
+# this file contains the parquet tests that use the S3 filesystem
+pytest -v --pyargs dask.bytes.tests.test_s3
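
The newly enabled module exercises dask's parquet round-trips against a
mock S3 endpoint like the one sketched above. Roughly the pattern under
test (a hedged sketch, not code from dask's test suite; the endpoint,
bucket name, and dummy credentials are illustrative):

    import dask.dataframe as dd
    import pandas as pd

    # storage_options are forwarded to s3fs; the endpoint assumes a mock
    # S3 server listening locally, as in the moto sketch above.
    storage_options = {
        "key": "testing",
        "secret": "testing",
        "client_kwargs": {"endpoint_url": "http://127.0.0.1:5000"},
    }
    df = dd.from_pandas(pd.DataFrame({"x": [1, 2, 3]}), npartitions=1)
    df.to_parquet("s3://test-bucket/data", engine="pyarrow",
                  storage_options=storage_options)
    print(dd.read_parquet("s3://test-bucket/data", engine="pyarrow",
                          storage_options=storage_options).compute())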