Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/28 14:51:05 UTC

[GitHub] [airflow] MatrixManAtYrService commented on issue #22561: Scheduler crashes: "Cannot unmap a deserialized operator"

MatrixManAtYrService commented on issue #22561:
URL: https://github.com/apache/airflow/issues/22561#issuecomment-1080753057


   @uranusjr  it does prevent the scheduler from crashing, but the task fails and I don't get any task logs from it (the `./logs/dag_id=my_name_is` directory never appears).
   
   Here are the scheduler logs:
   
   ```
     ____________       _____________
    ____    |__( )_________  __/__  /________      __
   ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
   ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
    _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
   [2022-03-28 08:40:44,195] {scheduler_job.py:712} INFO - Starting the scheduler
   [2022-03-28 08:40:44,196] {scheduler_job.py:717} INFO - Processing each file at most -1 times
   [2022-03-28 08:40:44,199] {executor_loader.py:106} INFO - Loaded executor: SequentialExecutor
   [2022-03-28 08:40:44,204] {manager.py:156} INFO - Launched DagFileProcessorManager with pid: 47735
   [2022-03-28 08:40:44,206] {scheduler_job.py:1242} INFO - Resetting orphaned tasks for active dag runs
   [2022-03-28 08:40:44,769] {settings.py:55} INFO - Configured default timezone Timezone('UTC')
    [2022-03-28 08:40:44,774] {manager.py:398} WARNING - Because we cannot use more than 1 thread (parsing_processes = 2) when using sqlite. So we set parallelism to 1.
   [2022-03-28 08:40:45 -0600] [47734] [INFO] Starting gunicorn 20.1.0
   [2022-03-28 08:40:45 -0600] [47734] [INFO] Listening at: http://0.0.0.0:8793 (47734)
   [2022-03-28 08:40:45 -0600] [47734] [INFO] Using worker: sync
   [2022-03-28 08:40:45 -0600] [47740] [INFO] Booting worker with pid: 47740
   [2022-03-28 08:40:45 -0600] [47741] [INFO] Booting worker with pid: 47741
    [2022-03-28 08:40:53,363] {dag.py:2930} INFO - Setting next_dagrun for my_name_is to 1999-12-25T00:00:00+00:00, run_after=2029-12-17T00:00:00+00:00
    [2022-03-28 08:40:53,397] {scheduler_job.py:354} INFO - 1 tasks up for execution:
            <TaskInstance: my_name_is.mynameis scheduled__1970-01-01T00:00:00+00:00 [scheduled]>
    [2022-03-28 08:40:53,397] {scheduler_job.py:376} INFO - Figuring out tasks to run in Pool(name=default_pool) with 128 open slots and 1 task instances ready to be queued
    [2022-03-28 08:40:53,397] {scheduler_job.py:434} INFO - DAG my_name_is has 0/16 running and queued tasks
    [2022-03-28 08:40:53,398] {scheduler_job.py:520} INFO - Setting the following tasks to queued state:
            <TaskInstance: my_name_is.mynameis scheduled__1970-01-01T00:00:00+00:00 [scheduled]>
    [2022-03-28 08:40:53,400] {scheduler_job.py:562} INFO - Sending TaskInstanceKey(dag_id='my_name_is', task_id='mynameis', run_id='scheduled__1970-01-01T00:00:00+00:00', try_number=1, map_index=-1) to executor with priority 1 and queue default
    [2022-03-28 08:40:53,401] {base_executor.py:88} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'my_name_is', 'mynameis', 'scheduled__1970-01-01T00:00:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/my_name_is.py']
    [2022-03-28 08:40:53,403] {sequential_executor.py:59} INFO - Executing command: ['airflow', 'tasks', 'run', 'my_name_is', 'mynameis', 'scheduled__1970-01-01T00:00:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/my_name_is.py']
   [2022-03-28 08:40:54,061] {dagbag.py:506} INFO - Filling up the DagBag from /Users/matt/2022/03/27/dags/my_name_is.py
   Traceback (most recent call last):
     File "/Users/matt/2022/03/27/venv2/bin/airflow", line 33, in <module>
       sys.exit(load_entry_point('apache-airflow', 'console_scripts', 'airflow')())
     File "/Users/matt/src/astronomer-airflow/airflow/__main__.py", line 38, in main
       args.func(args)
     File "/Users/matt/src/astronomer-airflow/airflow/cli/cli_parser.py", line 51, in command
       return func(*args, **kwargs)
     File "/Users/matt/src/astronomer-airflow/airflow/utils/cli.py", line 99, in wrapper
       return f(*args, **kwargs)
     File "/Users/matt/src/astronomer-airflow/airflow/cli/commands/task_command.py", line 362, in task_run
       ti, _ = _get_ti(task, args.execution_date_or_run_id, args.map_index)
     File "/Users/matt/src/astronomer-airflow/airflow/utils/session.py", line 71, in wrapper
       return func(*args, session=session, **kwargs)
     File "/Users/matt/src/astronomer-airflow/airflow/cli/commands/task_command.py", line 144, in _get_ti
       raise RuntimeError("No map_index passed to mapped task")
   RuntimeError: No map_index passed to mapped task
    [2022-03-28 08:40:54,233] {sequential_executor.py:66} ERROR - Failed to execute task Command '['airflow', 'tasks', 'run', 'my_name_is', 'mynameis', 'scheduled__1970-01-01T00:00:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/my_name_is.py']' returned non-zero exit status 1..
    [2022-03-28 08:40:54,235] {scheduler_job.py:615} INFO - Executor reports execution of my_name_is.mynameis run_id=scheduled__1970-01-01T00:00:00+00:00 exited with status failed for try_number 1
    [2022-03-28 08:40:54,245] {scheduler_job.py:659} INFO - TaskInstance Finished: dag_id=my_name_is, task_id=mynameis, run_id=scheduled__1970-01-01T00:00:00+00:00, map_index=-1, run_start_date=None, run_end_date=None, run_duration=None, state=queued, executor_state=failed, try_number=1, max_tries=0, job_id=None, pool=default_pool, queue=default, priority_weight=1, operator=_PythonDecoratedOperator, queued_dttm=2022-03-28 14:40:53.398693+00:00, queued_by_job_id=1, pid=None
    [2022-03-28 08:40:54,246] {scheduler_job.py:688} ERROR - Executor reports task instance <TaskInstance: my_name_is.mynameis scheduled__1970-01-01T00:00:00+00:00 [queued]> finished (failed) although the task says its queued. (Info: None) Was the task killed externally?
    [2022-03-28 08:40:54,247] {taskinstance.py:1785} ERROR - Executor reports task instance <TaskInstance: my_name_is.mynameis scheduled__1970-01-01T00:00:00+00:00 [queued]> finished (failed) although the task says its queued. (Info: None) Was the task killed externally?
    [2022-03-28 08:40:54,255] {taskinstance.py:1277} INFO - Marking task as FAILED. dag_id=my_name_is, task_id=mynameis, execution_date=19700101T000000, start_date=, end_date=20220328T144054
    [2022-03-28 08:40:54,322] {dagrun.py:534} ERROR - Marking run <DagRun my_name_is @ 1970-01-01 00:00:00+00:00: scheduled__1970-01-01T00:00:00+00:00, externally triggered: False> failed
    [2022-03-28 08:40:54,323] {dagrun.py:594} INFO - DagRun Finished: dag_id=my_name_is, execution_date=1970-01-01 00:00:00+00:00, run_id=scheduled__1970-01-01T00:00:00+00:00, run_start_date=2022-03-28 14:40:53.370452+00:00, run_end_date=2022-03-28 14:40:54.323434+00:00, run_duration=0.952982, state=failed, external_trigger=False, run_type=scheduled, data_interval_start=1970-01-01 00:00:00+00:00, data_interval_end=1999-12-25 00:00:00+00:00, dag_hash=fb9f4777fcb737e19d340199f9950b05
    [2022-03-28 08:40:54,325] {dagrun.py:774} WARNING - Failed to record first_task_scheduling_delay metric: list index out of range
    [2022-03-28 08:40:54,326] {dag.py:2930} INFO - Setting next_dagrun for my_name_is to 1999-12-25T00:00:00+00:00, run_after=2029-12-17T00:00:00+00:00
   ```
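
   Reading the traceback: the scheduler queued the task with `map_index=-1` (the sentinel for "not mapped"), but by the time the CLI worker deserializes the DAG the task is mapped, so `_get_ti` refuses to resolve a task instance and the process dies before any task-level logging is set up, which would explain the missing `./logs/dag_id=my_name_is` directory. A simplified sketch of the guard the traceback is hitting (my reconstruction from the stack trace, not the actual Airflow source; `task_is_mapped` stands in for however the real code detects a mapped task):

   ```python
   # Reconstruction of the failing check in
   # airflow/cli/commands/task_command.py::_get_ti -- shape only, not the
   # real implementation. The executor command carries map_index=-1
   # ("unmapped"), the deserialized task is mapped, so we bail out.

   def _get_ti_sketch(task_is_mapped: bool, map_index: int) -> int:
       if task_is_mapped and map_index < 0:
           raise RuntimeError("No map_index passed to mapped task")
       return map_index  # the real _get_ti returns a TaskInstance here


   # What the scheduler asked the CLI to do in the log above:
   try:
       _get_ti_sketch(task_is_mapped=True, map_index=-1)
   except RuntimeError as err:
       print(err)  # -> No map_index passed to mapped task
   ```

   If that reading is right, the fix would need the scheduler to send a real `map_index` for mapped tasks (or the CLI to tolerate `-1` by expanding the task) rather than anything in the DAG file itself.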

