Posted to commits@airflow.apache.org by "Jarek Potiuk (Jira)" <ji...@apache.org> on 2019/10/30 13:18:00 UTC

[jira] [Commented] (AIRFLOW-4496) Airflow `backfill` fails on pickle thread error when --task_regex used

    [ https://issues.apache.org/jira/browse/AIRFLOW-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16963010#comment-16963010 ] 

Jarek Potiuk commented on AIRFLOW-4496:
---------------------------------------

Hello [~trevorpburke] - do you still have this problem? I believe it has something to do with one of the operators you are using in the DAG. Can you share the DAG structure you have?
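
For context, this error usually means that something reachable from the DAG (an attribute of an operator, hook, or callback, for example) holds an object that cannot be pickled or deep-copied, such as a threading lock. A minimal sketch that reproduces the same TypeError - the class below is hypothetical, only there to illustrate the mechanism:

{code:python}
import pickle
import threading

class OperatorLike:
    """Hypothetical stand-in for an operator (or a client it holds) that keeps a lock."""
    def __init__(self):
        self._lock = threading.RLock()  # locks cannot be pickled

try:
    pickle.dumps(OperatorLike())
except TypeError as exc:
    print(exc)  # -> can't pickle _thread.RLock objects
{code}

When `--task_regex` is used, backfill works on a copy of the DAG restricted to the matching tasks, which is typically where such an object trips the pickling.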

> Airflow `backfill` fails on pickle thread error when --task_regex used
> ----------------------------------------------------------------------
>
>                 Key: AIRFLOW-4496
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4496
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG
>    Affects Versions: 1.10.2
>         Environment: Ubuntu 16.04.5 LTS (GNU/Linux 4.4.0-1075-aws x86_64)
> Postgres LocalExecutor
>            Reporter: Trevor Burke
>            Priority: Major
>
> Airflow backfill works properly when used without --task_regex, but when I use that flag I get the following stack trace:
> {code:java}
> TypeError: can't pickle _thread.RLock objects
> {code}
> The command I'm using is:
> {code:java}
> airflow backfill <dag_id>  -s 2019-04-15 -e 2019-05-08 -x -t normalize -i --reset_dagruns
> {code}
> {code:python}
> from datetime import datetime, timedelta
>
> from airflow import DAG
>
> # send_email (failure callback) and DBT (macro object) are defined elsewhere in our code base.
> interval_args = {
>     'owner': 'airflow',
>     'depends_on_past': True,
>     'start_date': datetime(2019, 4, 15),
>     'retries': 2,
>     'retry_delay': timedelta(minutes=5),
>     'on_failure_callback': send_email,
> }
>
> interval_dag = DAG('dag_id_redacted',
>                    default_args=interval_args,
>                    schedule_interval='*/15 * * * *',
>                    catchup=True,
>                    user_defined_macros=dict(DBT=DBT))
> {code}
> The task flow is basically: pull data from an external API, dump it to S3, flatten it for database loading, and load it into the database (a rough sketch of the wiring is below). The tasks have been performing fine and previous backfills have been successful, but --task_regex has been giving me issues.
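> For illustration, the wiring is roughly the following - the task ids and callables are placeholders, not the real code:
> {code:python}
> from airflow.operators.python_operator import PythonOperator
>
> # Placeholder callables standing in for the real pull/dump/flatten/load logic.
> def pull_from_api(**context): ...
> def dump_to_s3(**context): ...
> def flatten_for_load(**context): ...
> def load_to_db(**context): ...
>
> with interval_dag:
>     pull = PythonOperator(task_id='pull_from_api', python_callable=pull_from_api, provide_context=True)
>     dump = PythonOperator(task_id='dump_to_s3', python_callable=dump_to_s3, provide_context=True)
>     normalize = PythonOperator(task_id='normalize', python_callable=flatten_for_load, provide_context=True)
>     load = PythonOperator(task_id='load_to_db', python_callable=load_to_db, provide_context=True)
>
>     pull >> dump >> normalize >> load
> {code}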



--
This message was sent by Atlassian Jira
(v8.3.4#803005)