You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/07/30 09:52:04 UTC
[GitHub] [airflow] EtsuNDmA opened a new issue #17336: Celery worker fails with RecursionError when asctime is in log_format but is not in json_fields
EtsuNDmA opened a new issue #17336:
URL: https://github.com/apache/airflow/issues/17336
- **Apache Airflow version**: 2.1.2
- **Python**: Python 3.8.5
- **CeleryExecutor**
**Environment**:
- **OS** (e.g. from /etc/os-release): Debian GNU/Linux 10 (buster)
- **Kernel** (e.g. `uname -a`): Linux 074f2dc34905 5.10.25-linuxkit
**What happened**:
We use remote logging to elasticsearch. After upgrade from 1.10.15 to 2.1.2 celery worker fails on each task when the `logging.log_format` contains `asctime` but the `elasticsearch.json_fields` don't
Worker logs:
```
[2021-07-30 09:29:04,120: ERROR/ForkPoolWorker-1] Failed to execute task maximum recursion depth exceeded while calling a Python object.
Traceback (most recent call last):
File "/usr/local/lib/python3.8/logging/__init__.py", line 436, in format
return self._format(record)
File "/usr/local/lib/python3.8/logging/__init__.py", line 432, in _format
return self._fmt % record.__dict__
KeyError: 'asctime'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/logging/__init__.py", line 1081, in emit
msg = self.format(record)
File "/usr/local/lib/python3.8/logging/__init__.py", line 925, in format
return fmt.format(record)
File "/airflow/env/lib/python3.8/site-packages/airflow/utils/log/json_formatter.py", line 43, in format
super().format(record)
File "/usr/local/lib/python3.8/logging/__init__.py", line 667, in format
s = self.formatMessage(record)
File "/usr/local/lib/python3.8/logging/__init__.py", line 636, in formatMessage
return self._style.format(record)
File "/usr/local/lib/python3.8/logging/__init__.py", line 438, in format
raise ValueError('Formatting field not found in record: %s' % e)
ValueError: Formatting field not found in record: 'asctime'
... repeat many times until RecursionError
File "/usr/local/lib/python3.8/posixpath.py", line 143, in basename
sep = _get_sep(p)
File "/usr/local/lib/python3.8/posixpath.py", line 42, in _get_sep
if isinstance(path, bytes):
RecursionError: maximum recursion depth exceeded while calling a Python object
[2021-07-30 09:29:04,159: ERROR/ForkPoolWorker-1] Task airflow.executors.celery_executor.execute_command[505225e3-267c-4661-87af-8618a4f7d202] raised unexpected: AirflowException('Celery command failed on host: b0b216e2ccea')
Traceback (most recent call last):
File "/airflow/env/lib/python3.8/site-packages/celery/app/trace.py", line 412, in trace_task
R = retval = fun(*args, **kwargs)
File "/airflow/env/lib/python3.8/site-packages/celery/app/trace.py", line 704, in __protected_call__
return self.run(*args, **kwargs)
File "/airflow/env/lib/python3.8/site-packages/airflow/executors/celery_executor.py", line 88, in execute_command
_execute_in_fork(command_to_exec)
File "/airflow/env/lib/python3.8/site-packages/airflow/executors/celery_executor.py", line 99, in _execute_in_fork
raise AirflowException('Celery command failed on host: ' + get_hostname())
airflow.exceptions.AirflowException: Celery command failed on host: b0b216e2ccea
```
My logging setting are
```
[logging]
remote_logging = True
log_format = [%%(asctime)s] {%%(filename)s:%%(lineno)d} %%(levelname)s - %%(message)s
[elasticsearch]
host = http://localhost:9200
log_id_template = {dag_id}-{task_id}-{execution_date}-{try_number}
end_of_log_mark = end_of_log
write_stdout = True
json_format = True
# There is no `asctime` here
json_fields = filename, lineno, levelname, message
host_field = host
offset_field = offset
```
We use custom log_config.py to write all logs to syslog
```python
import copy
import sys
from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG
# we don't want airflow to write to files
# force stdout or elastic only
LOGGING_CONFIG = copy.deepcopy(DEFAULT_LOGGING_CONFIG)
for handler, config in LOGGING_CONFIG['handlers'].items():
if (handler in ('console', 'processor', 'processor_manager')
or handler == 'task' and config['class'] == 'airflow.utils.log.file_task_handler.FileTaskHandler'):
LOGGING_CONFIG['handlers'][handler] = {
# default airflow.utils.log.logging_mixin.RedirectStdHandler breaks DAG processor
'class': 'logging.StreamHandler',
'formatter': 'airflow',
'stream': sys.stdout,
'filters': ['mask_secrets'],
}
```
If I add `asctime` to `json_fields` or remove from `log_format` or disable remote logging, everything works as expected
**What you expected to happen**:
Celery worker should not fail
**How to reproduce it**:
Config logging and elasticsearch as in the example above. Execute any DAG. Look at the worker logs.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] github-actions[bot] commented on issue #17336: Celery worker fails with RecursionError when asctime is in log_format but is not in json_fields
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #17336:
URL: https://github.com/apache/airflow/issues/17336#issuecomment-1030717926
This issue has been closed because it has not received response from the issue author.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] github-actions[bot] closed issue #17336: Celery worker fails with RecursionError when asctime is in log_format but is not in json_fields
Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed issue #17336:
URL: https://github.com/apache/airflow/issues/17336
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] kaxil commented on issue #17336: Celery worker fails with RecursionError when asctime is in log_format but is not in json_fields
Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #17336:
URL: https://github.com/apache/airflow/issues/17336#issuecomment-1003098863
@EtsuNDmA Can you test again with latest version of Elasticsearch provider and Airflow 2.1.4+
(Potentially) Related PR:
- https://github.com/apache/airflow/pull/15163
- https://github.com/apache/airflow/pull/15414
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] EtsuNDmA commented on issue #17336: Celery worker fails with RecursionError when asctime is in log_format but is not in json_fields
Posted by GitBox <gi...@apache.org>.
EtsuNDmA commented on issue #17336:
URL: https://github.com/apache/airflow/issues/17336#issuecomment-917916675
@eladkal the way I see it this comment before `json_format` says that json specific fields will be used instead of default `log_format`
```
# Instead of the default log formatter, write the log lines as JSON
json_format = True
```
It's confusing that we should change json_fields in elasticsearh section and sync it with log_format in logging section. `log_format` furthermore used by many other handlers such as `processor_manager` and `processor`. I think that `json_fields` should be a subset of `log_format`, but not vice versa. And of course such misconfiguration in logging should not leads to RecursionError in runtime
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] github-actions[bot] commented on issue #17336: Celery worker fails with RecursionError when asctime is in log_format but is not in json_fields
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #17336:
URL: https://github.com/apache/airflow/issues/17336#issuecomment-1025011727
This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] eladkal commented on issue #17336: Celery worker fails with RecursionError when asctime is in log_format but is not in json_fields
Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #17336:
URL: https://github.com/apache/airflow/issues/17336#issuecomment-916768827
Correct me if i'm wrong but if `json_format=True` and `asctime` isn't in `json_fields` then you can't reference `asctime`
in `log_format`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] boring-cyborg[bot] commented on issue #17336: Celery worker fails with RecursionError when asctime is in log_format but is not in json_fields
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #17336:
URL: https://github.com/apache/airflow/issues/17336#issuecomment-889778690
Thanks for opening your first issue here! Be sure to follow the issue template!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org