Posted to commits@airflow.apache.org by "t oo (Jira)" <ji...@apache.org> on 2019/09/04 23:06:00 UTC
[jira] [Updated] (AIRFLOW-5355) 1.10.4 upgrade issues - No module named kubernetes (but i'm using localexecutor)
[ https://issues.apache.org/jira/browse/AIRFLOW-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
t oo updated AIRFLOW-5355:
--------------------------
Description:
I upgraded from 1.10.3 to 1.10.4 just now, but on the main page of the UI, if I click the refresh button next to my DAG, it shows the error 'Broken DAG: [x.py] No module named kubernetes'.
I have debug logging enabled but have no idea why it is trying to find kubernetes.
my install steps:
pip install cryptography mysqlclient ldap3 gunicorn[gevent]
pip install kubernetes
pip install apache-airflow-1.10.4-bin.tar.gz
pip install apache-airflow-1.10.4-bin.tar.gz[kubernetes]
airflow initdb
airflow upgradedb
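A quick way to narrow this down is to check whether the interpreter that runs the webserver and scheduler can actually import the package. The sketch below is not from the original report: check_modules is a hypothetical helper, and it assumes the script is run with the same virtualenv Python that launches Airflow.

```python
import importlib
import sys


def check_modules(names):
    """Map each module name to None if it imports, else to the error text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    # Show which interpreter is in use, since a different virtualenv
    # is one possible reason for a package appearing to be missing.
    print(sys.executable)
    for name, err in sorted(check_modules(["kubernetes", "kubernetes.client"]).items()):
        print("%-20s %s" % (name, "ok" if err is None else "FAILED: " + err))
```

If this reports a failure under the interpreter Airflow uses, the pip install steps above probably went into a different environment; if it reports ok, the import error is more likely raised from somewhere inside the DAG's import chain.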
my only dag has:
import datetime as dt
import glob
import json
import logging
import os
import subprocess
import re
from airflow import DAG
from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator
from airflow.operators.python_operator import PythonOperator
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import BranchPythonOperator
from airflow.hooks.base_hook import BaseHook
Environment: Airflow 1.10.4, LocalExecutor, Python 2, t3.large EC2, Spark standalone scheduler.
It worked fine in 1.10.3.
some things that may be relevant:
1. When I query the dag table in the MySQL metastore db, some dags have last_expired populated with a timestamp, but some have last_expired empty.
2. I am doing a blue-green deploy, so I have one EC2 running 1.10.3 and one EC2 running 1.10.4, but both EC2s are talking to a common MySQL metastore.
UPDATE: As an awful workaround, I commented out all references to kube*/pod in many .py files under site-packages!
some other small issues:
a) The tutorial.py DAG is in the UI; how do I remove it? When I click delete it says tutorial.py not found. UPDATE: deleting the DAG from the CLI removed it.
b) "[2019-08-30 04:12:57,714] {scheduler_job.py:924} WARNING - Tasks using non-existent pool '' will not be scheduled" is in the logs. UPDATE: this was due to pool=None in spark_submit; once that was removed it could run.
c) This is in the logs. UPDATE: https://github.com/apache/airflow/pull/5330#issuecomment-526919369 mentions that this is expected.
airflow-scheduler.log-[2019-08-30 09:05:38,451] {settings.py:327} DEBUG - Failed to import airflow_local_settings.
airflow-scheduler.log-Traceback (most recent call last):
airflow-scheduler.log- File "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/settings.py", line 315, in import_local_settings
airflow-scheduler.log- import airflow_local_settings
airflow-scheduler.log:ImportError: No module named airflow_local_settings
airflow-scheduler.log-[2019-08-30 09:05:38,452] {logging_config.py:59} DEBUG - Unable to load custom logging, using default config instead
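Regarding item b), the warning went away once the explicit pool=None was no longer passed. A minimal sketch of one way to avoid passing it, assuming the operator arguments are assembled in a dict first: drop_none is a hypothetical helper (not part of Airflow), and the argument values are placeholders rather than the reporter's real settings.

```python
def drop_none(kwargs):
    """Return a copy of kwargs without entries whose value is None."""
    return {key: value for key, value in kwargs.items() if value is not None}


# Placeholder arguments; in a DAG file these would be passed on as
# SparkSubmitOperator(**task_args).
task_args = drop_none({
    "task_id": "spark_job",
    "application": "/path/to/app.py",
    "pool": None,  # dropped, mirroring the fix of removing pool=None
})
print(sorted(task_args))
```

Filtering this way keeps falsy but meaningful values such as 0 or an empty string, since only None is removed.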
was:
i upgraded from 1.10.3 to 1.10.4 just now but in the main page of the ui, if i click refresh button next to my dag it shows error 'Broken DAG: [x.py] No module named kubernetes'
I have debug log enabled but have no idea why it is trying to find kubernetes.
my install steps:
pip install cryptography mysqlclient ldap3 gunicorn[gevent]
pip install kubernetes
pip install apache-airflow-1.10.4-bin.tar.gz
pip install apache-airflow-1.10.4-bin.tar.gz[kubernetes]
airflow initdb
airflow upgradedb
my only dag has:
import datetime as dt
import glob
import json
import logging
import os
import subprocess
import re
from airflow import DAG
from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator
from airflow.operators.python_operator import PythonOperator
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import BranchPythonOperator
from airflow.hooks.base_hook import BaseHook
airflow 1.10.4, localexecutor, python2, t3.large EC2, spark standalone scheduler
it worked fine in 1.10.3
some things that may be relevant:
# when i query dag table in mysql metastore db table, some dags have last_expired populated with a timestamp but some have last_expired that is empty.
# i am doing blue-green deploy so i have one ec2 running 1.10.3 and one ec2 running 1.10.4 but both ec2s are talking to a common mysql metastore
some other small issues:
a) tutorial.py dag is in the ui, how to remove? when i click delete it says tutorial.py not found
b) [2019-08-30 04:12:57,714] {scheduler_job.py:924} WARNING - Tasks using non-existent pool '' will not be scheduled is in logs
c) this is in logs:
airflow-scheduler.log-[2019-08-30 09:05:38,451] {settings.py:327} DEBUG - Failed to import airflow_local_settings.
airflow-scheduler.log-Traceback (most recent call last):
airflow-scheduler.log- File "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/settings.py", line 315, in import_local_settings
airflow-scheduler.log- import airflow_local_settings
airflow-scheduler.log:ImportError: No module named airflow_local_settings
airflow-scheduler.log-[2019-08-30 09:05:38,452] {logging_config.py:59} DEBUG - Unable to load custom logging, using default config instead
> 1.10.4 upgrade issues - No module named kubernetes (but i'm using localexecutor)
> --------------------------------------------------------------------------------
>
> Key: AIRFLOW-5355
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5355
> Project: Apache Airflow
> Issue Type: Bug
> Components: ui
> Affects Versions: 1.10.4
> Reporter: t oo
> Priority: Major
--
This message was sent by Atlassian Jira
(v8.3.2#803003)