Posted to commits@airflow.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/01/10 11:02:00 UTC
[jira] [Commented] (AIRFLOW-6529) Serialization error occurs when the scheduler tries to run on macOS.
[ https://issues.apache.org/jira/browse/AIRFLOW-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012711#comment-17012711 ]
ASF GitHub Bot commented on AIRFLOW-6529:
-----------------------------------------
sarutak commented on pull request #7128: [AIRFLOW-6529] Serialization error occurs when the scheduler tries to run on macOS.
URL: https://github.com/apache/airflow/pull/7128
Running the scheduler on macOS fails with a serialization error like the following.
```
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
[2020-01-10 19:54:41,974] {executor_loader.py:59} INFO - Using executor SequentialExecutor
[2020-01-10 19:54:41,983] {scheduler_job.py:1462} INFO - Starting the scheduler
[2020-01-10 19:54:41,984] {scheduler_job.py:1469} INFO - Processing each file at most -1 times
[2020-01-10 19:54:41,984] {scheduler_job.py:1472} INFO - Searching for files in /Users/sarutak/airflow/dags
[2020-01-10 19:54:42,025] {scheduler_job.py:1474} INFO - There are 27 files in /Users/sarutak/airflow/dags
[2020-01-10 19:54:42,025] {scheduler_job.py:1527} INFO - Resetting orphaned tasks for active dag runs
[2020-01-10 19:54:42,059] {scheduler_job.py:1500} ERROR - Exception when executing execute_helper
Traceback (most recent call last):
  File "/Users/sarutak/work/oss/airflow-env/master-python3.8.1/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1498, in _execute
    self._execute_helper()
  File "/Users/sarutak/work/oss/airflow-env/master-python3.8.1/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1531, in _execute_helper
    self.processor_agent.start()
  File "/Users/sarutak/work/oss/airflow-env/master-python3.8.1/lib/python3.8/site-packages/airflow/utils/dag_processing.py", line 348, in start
    self._process.start()
  File "/opt/python/3.8.1/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/opt/python/3.8.1/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/opt/python/3.8.1/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
    return Popen(process_obj)
  File "/opt/python/3.8.1/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/opt/python/3.8.1/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/opt/python/3.8.1/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/opt/python/3.8.1/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'SchedulerJob._execute.<locals>.processor_factory'
```
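The reason is that the scheduler starts its subprocesses using multiprocessing in spawn mode, which pickles the process object, and the locally defined `processor_factory` function cannot be pickled.
As of Python 3.8, spawn is the default start method on macOS.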
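
The pickling failure can be reproduced without Airflow. The following is a minimal sketch, not the code from this PR: `make_local_factory`, `module_level_factory` and `can_pickle` are hypothetical names introduced only for illustration. It shows that a function defined inside another function is rejected by `pickle`, which spawn mode relies on, while a module-level function wrapped in `functools.partial` is one possible picklable alternative.

```python
import pickle
from functools import partial


def make_local_factory(prefix):
    """Return a closure, similar in shape to a factory defined inside a method."""
    def processor_factory(file_path):
        return f"{prefix}:{file_path}"
    return processor_factory


def module_level_factory(prefix, file_path):
    """Module-level equivalent; pickle can reference it by its qualified name."""
    return f"{prefix}:{file_path}"


def can_pickle(obj):
    """Report whether obj survives pickle.dumps."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        # Local objects are rejected with one of these, depending on the Python version.
        return False


# The closure is a local object, so pickle refuses it.
local_ok = can_pickle(make_local_factory("dag"))

# A partial over a module-level function carries the same bound argument
# but is serialized as a reference to the function plus its arguments.
bound_factory = partial(module_level_factory, "dag")
partial_ok = can_pickle(bound_factory)

print("local closure picklable:", local_ok)
print("partial of module-level function picklable:", partial_ok)
```

When run as a script, the closure should be reported as not picklable. Another option, on platforms that support it, is to request the fork start method explicitly with `multiprocessing.set_start_method("fork")` or `multiprocessing.get_context("fork")`; whether that is appropriate for the scheduler is a separate design question.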
---
Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)
- [x] Description above provides context of the change
- [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID<sup>*</sup>
- [x] Unit tests coverage for changes (not needed for documentation changes)
- [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [x] Relevant documentation is updated including usage instructions.
- [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
<sup>*</sup> For document-only changes commit message can start with `[AIRFLOW-XXXX]`.
---
In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
> Serialization error occurs when the scheduler tries to run on macOS.
> --------------------------------------------------------------------
>
> Key: AIRFLOW-6529
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6529
> Project: Apache Airflow
> Issue Type: Bug
> Components: scheduler
> Affects Versions: 1.10.8
> Environment: macOS
> Python 3.8
> multiprocessing with spawn mode
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Major
>