Posted to issues@arrow.apache.org by "Timothy Luna (Jira)" <ji...@apache.org> on 2022/05/02 15:49:00 UTC
[jira] [Created] (ARROW-16437) Mocking tests with moto not currently feasible.
Timothy Luna created ARROW-16437:
------------------------------------
Summary: Mocking tests with moto not currently feasible.
Key: ARROW-16437
URL: https://issues.apache.org/jira/browse/ARROW-16437
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Affects Versions: 7.0.0
Environment: Ubuntu environment
Python 3.9
PyArrow 7.0.0
moto 3.1.7
Reporter: Timothy Luna
## Unable to use moto to mock S3 for testing purposes.
I've been using AWSWrangler as a loading utility in a custom application and am trying to remove it as a dependency, because PyArrow Dataset provides all the S3 functionality I need.
The problem is that when PyArrow resolves the filesystem for an `s3://` URI, it appears to bypass moto entirely and fails with:
```sh
============================================================================================================================ FAILURES ============================================================================================================================
_____________________________________________________________________________________________________________________ test__pull_cached_data _____________________________________________________________________________________________________________________
    @pytest.mark.usefixtures("s3")
    def test__pull_cached_data():
        """Tests pull cached data, both happy and sad."""
        # Here we're going to make a folder, transfer in some files,
        # and pull them!
        with tempfile.TemporaryDirectory() as t:
            # This commented code functions.
            # from awswrangler.s3 import read_parquet
            # sillything = read_parquet('s3://test-bucket/test_metadata.parquet')
>           a = ds.dataset('s3://test-bucket/test_metadata.parquet')

tests/custom_application/loading/test_load.py:156:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/local/lib/python3.9/site-packages/pyarrow/dataset.py:667: in dataset
return _filesystem_dataset(source, **kwargs)
/usr/local/lib/python3.9/site-packages/pyarrow/dataset.py:412: in _filesystem_dataset
fs, paths_or_selector = _ensure_single_source(source, filesystem)
/usr/local/lib/python3.9/site-packages/pyarrow/dataset.py:373: in _ensure_single_source
filesystem, path = _resolve_filesystem_and_path(path, filesystem)
/usr/local/lib/python3.9/site-packages/pyarrow/fs.py:179: in _resolve_filesystem_and_path
filesystem, path = FileSystem.from_uri(path)
pyarrow/_fs.pyx:350: in pyarrow._fs.FileSystem.from_uri
???
pyarrow/error.pxi:143: in pyarrow.lib.pyarrow_internal_check_status
???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E OSError: When resolving region for bucket 'test-bucket': AWS Error [code 99]: curlCode: 35, SSL connect error
pyarrow/error.pxi:114: OSError
--------------------------------------------------------------------------------------------------------------------- Captured stderr setup ----------------------------------------------------------------------------------------------------------------------
INFO:botocore.credentials:Found credentials in environment variables.
----------------------------------------------------------------------------------------------------------------------- Captured log setup -----------------------------------------------------------------------------------------------------------------------
INFO botocore.credentials:credentials.py:1114 Found credentials in environment variables.
==================================================================================================================== short test summary info =====================================================================================================================
FAILED tests/custom_application/loading/test_load.py::test__pull_cached_data - OSError: When resolving region for bucket 'test-bucket': AWS Error [code 99]: curlCode: 35, SSL connect error
======================================================================================================================= 1 failed in 17.82s =======================================================================================================================
```
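For context, a possible workaround sketch (not a confirmed fix for this issue): PyArrow's S3 filesystem is built on the AWS SDK for C++ rather than botocore, so moto's in-process patching does not intercept its requests, which is consistent with the `curlCode` in the error above. One option is to run moto in server mode and pass PyArrow an explicit `S3FileSystem` pointing at it, which also skips the `FileSystem.from_uri` step seen in the traceback. The helper names, host, port, and dummy credentials below are hypothetical and would need to match the actual test fixture.

```python
"""Sketch: aim PyArrow at a local moto server instead of relying on boto patching."""


def moto_s3_options(host="127.0.0.1", port=5000):
    """Build keyword arguments for pyarrow.fs.S3FileSystem.

    The host, port, and credentials are assumptions for a local moto
    server; adjust them to match the test fixture.
    """
    return {
        "endpoint_override": f"{host}:{port}",  # send requests to moto, not AWS
        "scheme": "http",                       # assumes moto is serving plain HTTP
        "access_key": "testing",                # dummy credentials
        "secret_key": "testing",
        "region": "us-east-1",                  # region set explicitly
    }


def strip_s3_scheme(uri):
    """Turn 's3://bucket/key' into 'bucket/key' for use with an explicit filesystem."""
    prefix = "s3://"
    return uri[len(prefix):] if uri.startswith(prefix) else uri


def open_mocked_dataset(uri, host="127.0.0.1", port=5000):
    """Open a dataset through an explicit S3FileSystem aimed at moto."""
    # Imported lazily so the helpers above work without PyArrow installed.
    import pyarrow.dataset as ds
    from pyarrow import fs

    s3 = fs.S3FileSystem(**moto_s3_options(host, port))
    return ds.dataset(strip_s3_scheme(uri), filesystem=s3)
```

With this approach the `s3` fixture would have to start a moto server (for example `moto_server -p 5000`, or `ThreadedMotoServer` from `moto.server`) and create `test-bucket` through a boto3 client configured with the same endpoint, after which the test could call `open_mocked_dataset('s3://test-bucket/test_metadata.parquet')`.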
Please let me know if you need additional information!