Posted to commits@airflow.apache.org by "Adam Wentz (JIRA)" <ji...@apache.org> on 2017/09/01 19:38:00 UTC
[jira] [Updated] (AIRFLOW-1558) S3FileTransformOperator fails in Python 3 due to file mode
[ https://issues.apache.org/jira/browse/AIRFLOW-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adam Wentz updated AIRFLOW-1558:
--------------------------------
Description:
When running under Python 3, the S3FileTransformOperator fails with the following error:
{noformat}
[2017-09-01 18:44:54,440] {models.py:1427} ERROR - write() argument must be str, not bytes
[2017-09-01 18:44:54,443] {base_task_runner.py:95} INFO - Subtask: Traceback (most recent call last):
[2017-09-01 18:44:54,444] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1384, in run
[2017-09-01 18:44:54,445] {base_task_runner.py:95} INFO - Subtask: result = task_copy.execute(context=context)
[2017-09-01 18:44:54,445] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/lib/python3.6/site-packages/airflow/operators/s3_file_transform_operator.py", line 87, in execute
[2017-09-01 18:44:54,446] {base_task_runner.py:95} INFO - Subtask: source_s3_key_object.get_contents_to_file(f_source)
[2017-09-01 18:44:54,446] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/airflow/.local/lib/python3.6/site-packages/boto/s3/key.py", line 1662, in get_contents_to_file
[2017-09-01 18:44:54,447] {base_task_runner.py:95} INFO - Subtask: response_headers=response_headers)
[2017-09-01 18:44:54,447] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/airflow/.local/lib/python3.6/site-packages/boto/s3/key.py", line 1494, in get_file
[2017-09-01 18:44:54,447] {base_task_runner.py:95} INFO - Subtask: query_args=None)
[2017-09-01 18:44:54,448] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/airflow/.local/lib/python3.6/site-packages/boto/s3/key.py", line 1548, in _get_file_internal
[2017-09-01 18:44:54,448] {base_task_runner.py:95} INFO - Subtask: fp.write(bytes)
[2017-09-01 18:44:54,448] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/lib/python3.6/tempfile.py", line 483, in func_wrapper
[2017-09-01 18:44:54,449] {base_task_runner.py:95} INFO - Subtask: return func(*args, **kwargs)
[2017-09-01 18:44:54,449] {base_task_runner.py:95} INFO - Subtask: TypeError: write() argument must be str, not bytes
[2017-09-01 18:44:54,450] {base_task_runner.py:95} INFO - Subtask: [2017-09-01 18:44:54,443] {models.py:1451} INFO - Marking task as FAILED.
{noformat}
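The fix is to open the `NamedTemporaryFile`s in binary mode (`wb`) rather than text mode (`w`): boto's `get_contents_to_file` writes `bytes`, which a text-mode file object rejects under Python 3. I have an incoming PR for this.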
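For anyone who wants to reproduce the mode problem without S3 or boto, here is a minimal sketch. It is not the operator's code: the `payload` value and the `try_write` helper are made up for illustration, and the payload only stands in for the bytes that boto passes to `fp.write`.

```python
import tempfile

# Hypothetical stand-in for the bytes boto streams from the S3 key.
payload = b"col_a,col_b\n1,2\n"


def try_write(mode):
    """Write `payload` to a NamedTemporaryFile opened with `mode`.

    Returns a short status string instead of raising, so both modes
    can be compared side by side.
    """
    try:
        with tempfile.NamedTemporaryFile(mode) as f:
            f.write(payload)
            f.flush()
            return "ok: wrote %d bytes" % f.tell()
    except TypeError as exc:
        # Under Python 3 a text-mode file only accepts str.
        return "TypeError: %s" % exc


if __name__ == "__main__":
    print("mode 'w' ->", try_write("w"))
    print("mode 'wb' ->", try_write("wb"))
```

Under Python 3 the `w` case should hit the same `TypeError` as in the traceback, while the `wb` case writes the bytes without complaint.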
was:
When running under python3 the S3FileTransformOperator fails with the following error:
```
[2017-09-01 18:44:54,440] {models.py:1427} ERROR - write() argument must be str, not bytes
[2017-09-01 18:44:54,443] {base_task_runner.py:95} INFO - Subtask: Traceback (most recent call last):
[2017-09-01 18:44:54,444] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1384, in run
[2017-09-01 18:44:54,445] {base_task_runner.py:95} INFO - Subtask: result = task_copy.execute(context=context)
[2017-09-01 18:44:54,445] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/lib/python3.6/site-packages/airflow/operators/s3_file_transform_operator.py", line 87, in execute
[2017-09-01 18:44:54,446] {base_task_runner.py:95} INFO - Subtask: source_s3_key_object.get_contents_to_file(f_source)
[2017-09-01 18:44:54,446] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/airflow/.local/lib/python3.6/site-packages/boto/s3/key.py", line 1662, in get_contents_to_file
[2017-09-01 18:44:54,447] {base_task_runner.py:95} INFO - Subtask: response_headers=response_headers)
[2017-09-01 18:44:54,447] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/airflow/.local/lib/python3.6/site-packages/boto/s3/key.py", line 1494, in get_file
[2017-09-01 18:44:54,447] {base_task_runner.py:95} INFO - Subtask: query_args=None)
[2017-09-01 18:44:54,448] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/airflow/.local/lib/python3.6/site-packages/boto/s3/key.py", line 1548, in _get_file_internal
[2017-09-01 18:44:54,448] {base_task_runner.py:95} INFO - Subtask: fp.write(bytes)
[2017-09-01 18:44:54,448] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/lib/python3.6/tempfile.py", line 483, in func_wrapper
[2017-09-01 18:44:54,449] {base_task_runner.py:95} INFO - Subtask: return func(*args, **kwargs)
[2017-09-01 18:44:54,449] {base_task_runner.py:95} INFO - Subtask: TypeError: write() argument must be str, not bytes
[2017-09-01 18:44:54,450] {base_task_runner.py:95} INFO - Subtask: [2017-09-01 18:44:54,443] {models.py:1451} INFO - Marking task as FAILED.
```
The solution is to open the `NamedTemporaryFile`s with mode `wb` rather than `w`. I have an incoming PR for this.
> S3FileTransformOperator fails in Python 3 due to file mode
> ----------------------------------------------------------
>
> Key: AIRFLOW-1558
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1558
> Project: Apache Airflow
> Issue Type: Bug
> Components: operators
> Affects Versions: Airflow 1.8
> Environment: python3
> Reporter: Adam Wentz
> Priority: Minor