Posted to commits@airflow.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2017/07/11 18:44:00 UTC

[jira] [Commented] (AIRFLOW-1387) Logging causes UnicodeEncodeError on wget.

    [ https://issues.apache.org/jira/browse/AIRFLOW-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16082737#comment-16082737 ] 

ASF subversion and git services commented on AIRFLOW-1387:
----------------------------------------------------------

Commit 208e9a28435c917b504af852f9d33e8ff405a296 in incubator-airflow's branch refs/heads/master from [~task_fan]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=208e9a2 ]

[AIRFLOW-1387] Add unicode string prefix

Python's format function encodes the supplied variable using the
same encoding as the string in which the substitution takes place.
If that string is not explicitly specified as unicode, the default
encoding is used. This raises an error when sys.getdefaultencoding()
is not 'utf-8', because a string previously decoded from UTF-8 (and
containing non-ASCII characters) is then encoded with a codec that
cannot represent those characters.
Solution based on SO 5082452.

Closes #2426 from artiom33/logger_unicode_fix1
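
The behaviour described in the commit message can be sketched as follows.
This is a minimal illustration, not the code from the commit: the sample
line and the helper names format_subtask and implicit_encode_fails are
assumptions made for the example. On Python 2.7 a byte-string template such
as 'Subtask: {}' forces the unicode argument through the default codec; the
helper below only mimics that step with an explicit encode, so the sketch
also runs on Python 3.

```python
# -*- coding: utf-8 -*-

# Made-up sample line containing U+2018 / U+2019 quotation marks,
# similar to the character reported in the traceback.
line = u"Saving to: \u2018README.mirrors.txt\u2019\n"


def format_subtask(text):
    # Unicode template: on Python 2 the result stays unicode, so the
    # argument is never implicitly encoded with the default codec.
    return u'Subtask: {}'.format(text.rstrip(u'\n'))


def implicit_encode_fails(text, encoding='ascii'):
    # Mimics the implicit step Python 2 performs when a byte-string
    # template formats a unicode argument: encode with the default codec.
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return True
    return False


message = format_subtask(line)
```

With the unicode prefix, the message is built without any encoding step,
whatever sys.getdefaultencoding() returns.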


> Logging causes UnicodeEncodeError on wget.
> ------------------------------------------
>
>                 Key: AIRFLOW-1387
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1387
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: logging, operators, scheduler
>    Affects Versions: Airflow 1.8
>         Environment: Ubuntu 16.04 vagrant machine, python 2.7
>            Reporter: Artiom
>            Assignee: Artiom
>
> I encountered an issue that breaks my DAGs after switching from 1.7.1.3.
> The problem appears on my default Vagrant Ubuntu 16.04 machine. The output of the locale command:
> LANG=en_US.UTF-8
> LANGUAGE=en_US:
> LC_CTYPE="en_US.UTF-8"
> LC_NUMERIC="en_US.UTF-8"
> LC_TIME="en_US.UTF-8"
> LC_COLLATE="en_US.UTF-8"
> LC_MONETARY="en_US.UTF-8"
> LC_MESSAGES="en_US.UTF-8"
> LC_PAPER="en_US.UTF-8"
> LC_NAME="en_US.UTF-8"
> LC_ADDRESS="en_US.UTF-8"
> LC_TELEPHONE="en_US.UTF-8"
> LC_MEASUREMENT="en_US.UTF-8"
> LC_IDENTIFICATION="en_US.UTF-8"
> LC_ALL=
> To replicate, I created a DAG with a single bash operator task that runs 'download.sh'.
> The code for download.sh is pretty simple:
> {code:java}
> wget ftp://anonymous:guest@ftp.debian.org/debian/README.mirrors.txt
> {code}
> It breaks on the first curly quote (u'\u2018') in the output.
> {code:java}
> Jul 05 15:27:00 vagrant airflow[29929]: Exception in thread Thread-1:
> Jul 05 15:27:00 vagrant airflow[29929]: Traceback (most recent call last):
> Jul 05 15:27:00 vagrant airflow[29929]: File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
> Jul 05 15:27:00 vagrant airflow[29929]: self.run()
> Jul 05 15:27:00 vagrant airflow[29929]: File "/usr/lib/python2.7/threading.py", line 754, in run
> Jul 05 15:27:00 vagrant airflow[29929]: self._target(*self.args, **self._kwargs)
> Jul 05 15:27:00 vagrant airflow[29929]: File "/var/lib/airflow/python2.7/site-packages/airflow/task_runner/base_task_runner.py", line 95, in _read_task_logs
> Jul 05 15:27:00 vagrant airflow[29929]: self.logger.info('Subtask: {}'.format(line.rstrip('\n')))
> Jul 05 15:27:00 vagrant airflow[29929]: UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in position 58: ordinal not in range(128)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)