Posted to user@arrow.apache.org by Luke <vi...@gmail.com> on 2020/11/09 12:03:08 UTC

pyarrow.fs.S3FileSystem retry error/warning?

python version: 3.8.6
arrow: 2.0
S3: local s3 compatible object store using ceph

example code:

import os

from pyarrow import fs

s3 = fs.S3FileSystem()
# raw_data is just some random binary data, e.g.:
raw_data = os.urandom(1024)
out = s3.open_output_stream('testbucket/results/file1', compression='zstd').write(raw_data)

prints to stdout:

/arrow/cpp/src/arrow/io/interfaces.cc:229: Error ignored when destroying
file of type N5arrow2io22CompressedOutputStreamE: IOError: When completing
multiple part upload for key 'results/file1' in bucket 'testbucket': AWS
Error [code 1]: This multipart completion is already in progress with
address : 192.168.1.100 with address 192.168.1.100
/arrow/cpp/src/arrow/io/interfaces.cc:229: Error ignored when destroying
file of type N5arrow2fs12_GLOBAL__N_118ObjectOutputStreamE: IOError: When
completing multiple part upload for key 'results/file1' in bucket
'testbucket': AWS Error [code 1]: This multipart completion is already in
progress with address : 192.168.1.100 with address : 192.168.1.100
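
The message says the error is ignored when destroying the stream objects, and the one-liner above never calls close() explicitly, so I am assuming the multipart completion actually runs when the streams get destroyed. For comparison, the same write with an explicit close would look like this (just a sketch of what I mean; same bucket/key and raw_data as above):

with s3.open_output_stream('testbucket/results/file1',
                           compression='zstd') as stream:
    # I assume an error completing the upload would raise here instead
    # of only being printed when the stream object is destroyed.
    stream.write(raw_data)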

--

The object looks correct in S3, so I am guessing this message is printed on a
retry? If so, any idea why a retry would be occurring? Is there any mechanism
to tune the retries (timeouts, delay/backoff between retries, number of
retries, etc.)? I see this quite a bit. I am also assuming that an actual
failure on write would raise an exception rather than just printing this
message?
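
To make that concrete: the only thing I can think of right now is an
application-level retry loop with backoff around the write, which I would
rather avoid if the filesystem itself exposes tuning options. A rough sketch
of what I mean (assuming failures surface as exceptions; s3 and raw_data as
in the example above):

import time

def write_with_retries(s3, path, data, attempts=3, backoff=1.0):
    # Plain application-level retry with exponential backoff between
    # attempts; the filesystem's own internal retries would still
    # happen underneath this.
    for attempt in range(attempts):
        try:
            with s3.open_output_stream(path, compression='zstd') as stream:
                stream.write(data)
            return
        except OSError:
            if attempt == attempts - 1:
                raise
            time.sleep(backoff * (2 ** attempt))

write_with_retries(s3, 'testbucket/results/file1', raw_data)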


Thanks for this new fs interface; it is really going to help.

thanks,
Luke