Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/05 04:29:42 UTC

[GitHub] [airflow] josh-fell opened a new pull request #19419: Handle more patterns in `restrict-start_date` pre-commit hook

josh-fell opened a new pull request #19419:
URL: https://github.com/apache/airflow/pull/19419


   The current `restrict-start_date` hook does not catch `start_date` defined in the `default_args` of example DAGs, for example `default_args=dict(start_date=dates.days_ago(1))` as seen in the Google provider.
   
   This PR updates the pre-commit hook's regex to catch patterns like the above and others.
   
   For example, the new regex will capture:
   - `default_args=dict(owner=airflow, retries=1, start_date=`
   - `default_args=dict( owner: airflow, start_date =:`
   - `default_args={'owner': 'airflow', 'start_date' :`
   - `default_args={'owner': 'airflow','retries': 1, 'start_date':`
   - `default_args=dict(owner= "airflow", start_date=`
   - `default_args=dict( owner= "airflow",start_date:`
   - `default_args={'owner': 'airflow','start_date' :`
   - `default_args=dict(start_date=dates.days_ago(1)),`
   - `default_args=dict( start_date=dates.days_ago(1)),`
   - `default_args=dict(start_date =dates.days_ago(1)),`
   - `default_args=dict( start_date =dates.days_ago(1)),`
   - `default_args={'start_date':`
   - `default_args = {'start_date':`
   - `default_args={ 'start_date':`
   - `default_args = { 'start_date':`
   - `default_args={"start_date":`
   - `default_args = {"start_date":`
   - `default_args={ "start_date":`
   - `default_args = { "start_date":`
   - `default_args={'start_date' :`
   - `default_args = {'start_date' :`
   - `default_args={ 'start_date' :`
   - `default_args = { 'start_date' :`
   - `default_args={ "start_date" :`
   - `default_args = { "start_date" :`
   
   But **not** these patterns (as expected):
   - `default_args=dict('owner': 'airflow', _start_date=`
   - `default_args=dict( 'owner': 'airflow', my_start_date=`
   - `default_args=dict(owner='airflow', my_start_date=`
   - `default_args={'owner': 'airflow', my_start_date=`
   - `default_args={"start_date " :`
   - `default_args = {"start_date " :`
   - `default_args={'my_start_date':`
   - `default_args = {'ystart_date':`
   - `default_args={ '_start_date':`
   - `default_args = { 'oistart_date':`
   - `default_args = {"estart_date":`
   - `default_args={ ",start_dates":`
   - `default_args = { "|start_date":`
   - `default_args={' start_dates' :`
   - `default_args = {'67start_date' :`
   - `default_args={" start_date" :`
   - `default_args = {" start_date" :`
   - `default_args={ "start_datee" :`
   - `default_args = { "start_date " :`
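
The PR's actual regex is not shown in this thread. As a rough illustration only, the sketch below is one hypothetical pattern (the name `START_DATE_IN_DEFAULT_ARGS` is an assumption, not taken from the hook) that behaves as described for the examples listed above: it accepts a bare or consistently quoted `start_date` key directly after the opening `dict(` / `{` or after a comma, and rejects prefixed, suffixed, or padded keys.

```python
import re

# Hypothetical pattern, NOT the regex from the PR: a sketch that reproduces
# the match / no-match behaviour of the examples listed above.
START_DATE_IN_DEFAULT_ARGS = re.compile(
    r"""default_args\s*=\s*(?:dict\(|\{)   # opening of dict( or a dict literal
        (?:.*,)?\s*                        # optional earlier entries ending in a comma
        (['"]?)start_date\1                # key: bare, or wrapped in matching quotes
        \s*[=:]                            # keyword '=' or dict ':'
    """,
    re.VERBOSE,
)


def defines_start_date(line: str) -> bool:
    """Return True if the line sets start_date inside default_args."""
    return START_DATE_IN_DEFAULT_ARGS.search(line) is not None
```

For instance, `defines_start_date("default_args = { 'start_date' :")` is true, while `defines_start_date("default_args={'my_start_date':")` is false. A line-based pattern like this still cannot see a `default_args` definition that spans several lines.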
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #19419: Handle more patterns in `restrict-start_date` pre-commit hook

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #19419:
URL: https://github.com/apache/airflow/pull/19419#issuecomment-961705849


   Should we stop using regexp for that one?
   
   The old Chinese proverb says: "If you have a problem, introduce regexp - you will have two problems". This regexp will be impossible to reason about, fix, or improve for anyone who adds a different pattern in the future.
   
   Why don't we `ast.parse()` the example DAGs, walk the tree, and find `default_args` and `start_date`? That sounds like 20 lines of Python code that will handle REALLY all cases (and it has a bonus: we will also validate that the example DAGs are parseable Python files).





[GitHub] [airflow] josh-fell commented on pull request #19419: Handle more patterns in `restrict-start_date` pre-commit hook

Posted by GitBox <gi...@apache.org>.
josh-fell commented on pull request #19419:
URL: https://github.com/apache/airflow/pull/19419#issuecomment-961869100


   >The old Chinese proverb says: "If you have a problem, introduce regexp - you will have two problems". This regexp will be impossible to reason about, fix and improve by anyone who will add a different pattern in the future.
   
   Another daily wisdom nugget presented by @potiuk.
   
   I like the ast tree idea. In #19237 we briefly discussed adding a pre-commit hook to validate that `catchup=False` is set in example DAGs. It seems this ast approach would yield a hook that kills two birds with one stone.





[GitHub] [airflow] josh-fell closed pull request #19419: Handle more patterns in `restrict-start_date` pre-commit hook

Posted by GitBox <gi...@apache.org>.
josh-fell closed pull request #19419:
URL: https://github.com/apache/airflow/pull/19419


   





[GitHub] [airflow] josh-fell edited a comment on pull request #19419: Handle more patterns in `restrict-start_date` pre-commit hook

Posted by GitBox <gi...@apache.org>.
josh-fell edited a comment on pull request #19419:
URL: https://github.com/apache/airflow/pull/19419#issuecomment-961869100


   >The old Chinese proverb says: "If you have a problem, introduce regexp - you will have two problems". This regexp will be impossible to reason about, fix and improve by anyone who will add a different pattern in the future.
   
   Another daily wisdom nugget presented by @potiuk.
   
   I like the ast tree idea. In #19237 we briefly discussed adding a pre-commit hook to validate that `catchup=False` is set in example DAGs. It seems this ast approach would yield a hook that kills two birds with one stone, and could support other validations in the future.





[GitHub] [airflow] potiuk commented on pull request #19419: Handle more patterns in `restrict-start_date` pre-commit hook

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19419:
URL: https://github.com/apache/airflow/pull/19419#issuecomment-961705849


   Should we stop using regexp for that one? The old Chinese proverb says: "If you have a problem, introduce regexp - you will have two problems". This regexp will be impossible to reason about, fix, or improve for anyone who adds a different pattern in the future.
   
   Why don't we `ast.parse()` the example DAGs, walk the tree, and find `default_args` and `start_date`? That sounds like 20 lines of Python code that will handle REALLY all cases (and it has a bonus: we will also validate that the example DAGs are parseable Python files).

