Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/08/11 18:46:37 UTC

[GitHub] [airflow] rchangj opened a new issue #17558: New job fail immediately after start, no logs

rchangj opened a new issue #17558:
URL: https://github.com/apache/airflow/issues/17558


   **Apache Airflow version**: 1.10.12
   
   **Apache Airflow Provider versions** (please include all providers that are relevant to your bug):
   No other providers
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   Not using k8s
   
   **Environment**:
   AWS Linux
   - **Cloud provider or hardware configuration**: AWS
   - **OS** (e.g. from /etc/os-release):
     NAME="Amazon Linux"
     VERSION="2"
     ID="amzn"
     ID_LIKE="centos rhel fedora"
   - **Kernel** (e.g. `uname -a`): 4.14.186-146.268.amzn2.x86_64 #1 SMP Tue Jul 14 18:16:52 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
   - **Install tools**: yum, pip3
   - **Others**:
   
   **What happened**:
   I have been running Airflow 1.10.12 for several months without any issues. Recently I updated Airflow to use an SSL cert. After that I rebooted the Airflow server and then restarted the Airflow scheduler and webserver as follows:
   airflow webserver &
   airflow scheduler &
   
   The jobs ran normally at the beginning. However, within one day, all new jobs started failing immediately. In the UI, the task status is shown as failed, but there is no log under the task when I check the log from the UI.
    
   I also checked the /logs/scheduler directory. Apart from a warning, `psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "variable_key_key"`, which has existed for a long time, I didn't see any other warning or error.
   
   Now this problem happens every day: if I restart the Airflow webserver and scheduler, everything works well, but within one day all new jobs fail immediately with no log.
   
   Do you have any insight into what could be going wrong? Thanks.
   
   
   
   **What you expected to happen**:
   
   **How to reproduce it**:
   I'm not exactly sure how to reproduce it. I just rebooted the Airflow server.
   
   **Anything else we need to know**:
   
   
   
   How often does this problem occur? Once? Every time etc?
   I noticed the problem happens within 1 day.
   
   Any relevant logs to include? Put them here inside a detail tag:
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] eladkal commented on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-917581128


   > bumped into the same issue, `DummyOperator` failed immediately without any log
   
   There isn't enough information to understand what the issue is.
   Please add reproduction steps.
   I would suggest filing a new issue, as this one is on Airflow 1.10.12, which is EOL, and the author reported that the issue went away after upgrading to Airflow 2.





[GitHub] [airflow] rchangj commented on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
rchangj commented on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-898195873


   Hi, 
   I updated to the latest airflow 2.1.2. I got this error:
   Python version: 3.7.10
   Airflow version: 2.1.2
   Node: ip-10-77-52-33.vpc.internal
   -------------------------------------------------------------------------------
   Traceback (most recent call last):
     File "/home/ec2-user/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
       cursor, statement, parameters, context
     File "/home/ec2-user/.local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
       cursor.execute(statement, parameters)
   psycopg2.errors.UndefinedTable: relation "dag" does not exist
   LINE 2: FROM dag
   
   I followed the recommendation from this post: https://forums.aws.amazon.com/thread.jspa?threadID=341148 and the default airflow.providers.amazon.aws.secrets.systems_manager.SystemsManagerParameterStoreBackend, but the problem still exists.
   
   Any insight into what could be going wrong?





[GitHub] [airflow] rchangj commented on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
rchangj commented on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-942945813


   This issue shows up on version 2.1.2 as well. No error appears in any log. As I mentioned earlier, the command `sudo /sbin/runuser -l <user> -c "airflow scheduler"` can be used as a workaround. It would be good to have a fix for this issue.





[GitHub] [airflow] david30907d commented on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
david30907d commented on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-917540756


   bumped into the same issue, `DummyOperator` failed immediately without any log





[GitHub] [airflow] potiuk commented on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-897172899


   It's really, really hard to say without more logs. You might want to try to find more logs. Also, it would be great if you could explain what kind of SSL configuration you've made (and why), and possibly revert that change (even in a test system) to see if that fixes the problem. I do not expect it to; maybe you also upgraded some libraries etc. along the way.
   
   Just one note: it's not very likely you will get a lot of help with 1.10.12. Airflow 1.10 reached End-Of-Life on June 17th 2021 and will not receive any more fixes, not even critical ones, so you should upgrade to Airflow 2 as soon as you can. Maybe this is the right time.





[GitHub] [airflow] rchangj commented on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
rchangj commented on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-898811991


   After I used the following command to start Airflow, the issue went away. It seems there is a corner case related to permissions:
   sudo /sbin/runuser -l <user> -c "airflow scheduler"
   
   In the meantime, I have Airflow 2.1.2 installed and running the scheduler, fingers crossed.
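
One way to probe the permission theory is to look for log files that belong to a different account than the one running the scheduler. The following is only a minimal sketch under that assumption (the thread does not confirm that file ownership is the cause); `find_foreign_owned` and the `~/airflow/logs` path are hypothetical names used for illustration and are not part of Airflow:

```python
import os


def find_foreign_owned(root, expected_uid):
    """Return sorted paths under root (root included) not owned by expected_uid."""
    foreign = []
    if os.stat(root).st_uid != expected_uid:
        foreign.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat inspects the entry itself, so a dangling symlink does not raise.
            if os.lstat(path).st_uid != expected_uid:
                foreign.append(path)
    return sorted(foreign)


if __name__ == "__main__":
    # Hypothetical location; substitute the base_log_folder from your airflow.cfg.
    log_root = os.path.expanduser("~/airflow/logs")
    if os.path.isdir(log_root):
        for path in find_foreign_owned(log_root, os.getuid()):
            print(path)
```

If this prints anything when run as the user passed to `runuser`, those entries were created by another account (for example, by a scheduler started from a root shell), which would be consistent with the workaround above but is not proof of it.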





[GitHub] [airflow] github-actions[bot] closed issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed issue #17558:
URL: https://github.com/apache/airflow/issues/17558


   





[GitHub] [airflow] github-actions[bot] commented on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-974729835


   This issue has been closed because it has not received a response from the issue author.





[GitHub] [airflow] eladkal edited a comment on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
eladkal edited a comment on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-917581128


   > bumped into the same issue, `DummyOperator` failed immediately without any log
   
   There isn't enough information to understand what the issue is.
   Please add reproduction steps.
   I would suggest filing a new issue, as this one is on Airflow 1.10.12, which is EOL, and the author reported that the issue has been resolved.





[GitHub] [airflow] rchangj commented on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
rchangj commented on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-898201909


   Providers info:
   apache-airflow-providers-amazon==2.1.0
   apache-airflow-providers-apache-hdfs==2.0.0
   apache-airflow-providers-ftp==2.0.0
   apache-airflow-providers-imap==2.0.0
   apache-airflow-providers-postgres==2.0.0
   apache-airflow-providers-slack==4.0.0
   apache-airflow-providers-sqlite==2.0.0
   apache-airflow-providers-ssh==2.1.0
   
   Postgresql version:
   postgresql-server-13.3-2.amzn2.0.1.x86_64
   postgresql-13.3-2.amzn2.0.1.x86_64
   postgresql-contrib-13.3-2.amzn2.0.1.x86_64





[GitHub] [airflow] github-actions[bot] commented on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-968178506


   This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in the next 7 days if no further activity occurs from the issue author.





[GitHub] [airflow] rchangj commented on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
rchangj commented on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-897213964


   The SSL certs are just self-signed certs created as follows:
       openssl genrsa -out private.pem 2048
       openssl req -new -x509 -key private.pem -out cacert.pem -days 1095
   The airflow.cfg changes are as follows:
      web_server_ssl_cert = path/to/cacert.pem
      web_server_ssl_key = path/to/private.pem
   
   There are ABSOLUTELY no other changes, nor any other library installations.
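
To rule out the SSL settings themselves, it can help to confirm that both configured files are readable by the account that starts Airflow. This is a minimal sketch, not an Airflow utility: `check_ssl_paths` is a hypothetical helper, and it assumes the two options sit in the `[webserver]` section of `airflow.cfg` and that relative paths are resolved against the current working directory:

```python
import configparser
import os


def check_ssl_paths(cfg_path):
    """Map each webserver SSL option to (configured path, readable by this user?)."""
    # interpolation=None keeps literal '%' characters in airflow.cfg from being expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(cfg_path)
    report = {}
    for option in ("web_server_ssl_cert", "web_server_ssl_key"):
        value = parser.get("webserver", option, fallback="").strip()
        # An empty or missing option counts as not readable.
        readable = bool(value) and os.access(value, os.R_OK)
        report[option] = (value, readable)
    return report
```

Running it as the same user that launches the webserver and scheduler shows whether a file is missing or unreadable for that account; a `False` entry would point at a path or permission problem rather than at the certificate contents.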





[GitHub] [airflow] david30907d commented on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
david30907d commented on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-917952391


   > > bumped into the same issue, `DummyOperator` failed immediately without any log
   > 
   > There isn't enough information to understand what the issue is.
   > Please add reproduction steps.
   > I would suggest filing a new issue, as this one is on Airflow 1.10.12, which is EOL, and the author reported that the issue has been resolved.
   
   Hi @eladkal, any suggestions on how to get the log? I'm using 2.1.3 and have no idea how to provide more details.





[GitHub] [airflow] rchangj removed a comment on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
rchangj removed a comment on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-898195873









[GitHub] [airflow] github-actions[bot] commented on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-942810493


   This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in the next 7 days if no further activity occurs from the issue author.





[GitHub] [airflow] boring-cyborg[bot] commented on issue #17558: New job fail immediately after start, no logs

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #17558:
URL: https://github.com/apache/airflow/issues/17558#issuecomment-897065208


   Thanks for opening your first issue here! Be sure to follow the issue template!
   

