You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/05/15 16:53:46 UTC

[GitHub] [airflow] ArgentFalcon opened a new pull request #8879: [Airflow-8878] Add the ability for the webserver to sync dags to the DB

ArgentFalcon opened a new pull request #8879:
URL: https://github.com/apache/airflow/pull/8879


   This is related to https://github.com/apache/airflow/issues/8878
   The issue is that sync_to_db is only called from two places:
      The scheduler (which is not running)
      initdb (which runs once)
   
   Therefore, we need another way to update the  model in the DB, so a flag and sync should do it. I think that statement will also be reevaluated every worker restart time, so it should be pretty up to date
   
   If anyone has advice on how I should test this change, I would appreciate it. 
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] BasPH commented on pull request #8879: [8878] Add the ability for the webserver to sync dags to the DB

Posted by GitBox <gi...@apache.org>.
BasPH commented on pull request #8879:
URL: https://github.com/apache/airflow/pull/8879#issuecomment-629524294


   This seems like a change oddly specific to your installation.
   
   Ideally I'd work towards some process dedicated to serializing DAGs into the database, and have the webserver _only_ read metadata from the database, and nothing else. Please elaborate why this would be needed?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ArgentFalcon closed pull request #8879: [8878] Add the ability for the webserver to sync dags to the DB

Posted by GitBox <gi...@apache.org>.
ArgentFalcon closed pull request #8879:
URL: https://github.com/apache/airflow/pull/8879


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ArgentFalcon commented on pull request #8879: [8878] Add the ability for the webserver to sync dags to the DB

Posted by GitBox <gi...@apache.org>.
ArgentFalcon commented on pull request #8879:
URL: https://github.com/apache/airflow/pull/8879#issuecomment-629663447


   @mik-laj the command is something I've thought about as well, but I was wondering if that might be more heavyduty than this deserves. 
   
   I'm not sure how this will cause more libraries to be loaded though. The dagbag is still loaded in the webserver like always, All this does is sync it to the DB, which just adds some SQL calls (which granted, is overhead, but that's why I added a flag)
   
   @BasPH This is general because it affects any Airflow webserver running without a scheduler, which other people could do. It is a niche use case though. 
   
   I can move it to a CLI command though
    


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #8879: [8878] Add the ability for the webserver to sync dags to the DB

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #8879:
URL: https://github.com/apache/airflow/pull/8879#issuecomment-629593316


   This change will cause a lot of libraries to be loaded into the webserver process that are unnecessary and will never be used.  If we need a method to force synchronization, I think we can add it as a separate command. Then your script will run two commands one after the other - `airflow dags sync_to_db && airflow webserver` 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org