You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/11 08:29:42 UTC

[GitHub] [airflow] ephraimbuddy opened a new pull request #20802: Only validate Params when DAG is triggered

ephraimbuddy opened a new pull request #20802:
URL: https://github.com/apache/airflow/pull/20802


   The current validation that happens when viewing the dag is raising
   errors and preventing users from viewing the DAG. This PR moves validation
   to the resolve method which runs validation when a dag is triggered with params
   
   Closes: https://github.com/apache/airflow/issues/20559
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #20802: Only validate Params when DAG is triggered

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #20802:
URL: https://github.com/apache/airflow/pull/20802#issuecomment-1013285414


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr merged pull request #20802: Only validate Params when DAG is triggered

Posted by GitBox <gi...@apache.org>.
uranusjr merged pull request #20802:
URL: https://github.com/apache/airflow/pull/20802


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ephraimbuddy commented on pull request #20802: Only validate Params when DAG is triggered

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on pull request #20802:
URL: https://github.com/apache/airflow/pull/20802#issuecomment-1009723363


   > @ephraimbuddy what I believe that this check prevents the use of an invalid param value in a dag & errors out during the scheduler parsing itself. With this change, I guess a user would be able to use invalid params & the scheduler would try to execute it as well?
   
   Got it. Let me think of a better way to solve this


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ephraimbuddy commented on pull request #20802: Only validate Params when DAG is triggered

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on pull request #20802:
URL: https://github.com/apache/airflow/pull/20802#issuecomment-1009798510


   @msumit, I have moved the validation to dagbag and I think it's better. Now, we get import errors when the params are invalid


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] msumit commented on pull request #20802: Only validate Params when DAG is triggered

Posted by GitBox <gi...@apache.org>.
msumit commented on pull request #20802:
URL: https://github.com/apache/airflow/pull/20802#issuecomment-1009716557


   @ephraimbuddy what I believe that this check prevents the use of an invalid param value in a dag & errors out during the scheduler parsing itself. With this change, I guess a user would be able to use invalid params & the scheduler would try to execute it as well?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ephraimbuddy commented on a change in pull request #20802: Only validate Params when DAG is triggered

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on a change in pull request #20802:
URL: https://github.com/apache/airflow/pull/20802#discussion_r785422048



##########
File path: airflow/models/dagbag.py
##########
@@ -398,6 +400,9 @@ def _process_modules(self, filepath, mods, file_last_changed_on_disk):
             dag.fileloc = mod.__file__
             try:
                 dag.timetable.validate()
+                # create a copy of params before validating

Review comment:
       We can actually validate directly without copy since we are not updating the params. Will update it
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #20802: Only validate Params when DAG is triggered

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #20802:
URL: https://github.com/apache/airflow/pull/20802#discussion_r782132542



##########
File path: airflow/models/dagbag.py
##########
@@ -398,6 +400,10 @@ def _process_modules(self, filepath, mods, file_last_changed_on_disk):
             dag.fileloc = mod.__file__
             try:
                 dag.timetable.validate()
+                # create a copy of params before validating
+                copied_params = copy.deepcopy(dag.params)
+                copied_params.update(conf or {})

Review comment:
       @ephraimbuddy has this been fixed? From what I can tell `conf` is still not defined correctly, I may be missing something here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ephraimbuddy commented on pull request #20802: Only validate Params when DAG is triggered

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on pull request #20802:
URL: https://github.com/apache/airflow/pull/20802#issuecomment-1009966725


   ![importerror](https://user-images.githubusercontent.com/4122866/148951265-9b4c8996-e9ec-4a98-851a-1255efc16f0c.png)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] jedcunningham commented on a change in pull request #20802: Only validate Params when DAG is triggered

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on a change in pull request #20802:
URL: https://github.com/apache/airflow/pull/20802#discussion_r784989929



##########
File path: airflow/models/dagbag.py
##########
@@ -398,6 +400,9 @@ def _process_modules(self, filepath, mods, file_last_changed_on_disk):
             dag.fileloc = mod.__file__
             try:
                 dag.timetable.validate()
+                # create a copy of params before validating

Review comment:
       Why though? We should add that to this comment or remove it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] msumit commented on a change in pull request #20802: Only validate Params when DAG is triggered

Posted by GitBox <gi...@apache.org>.
msumit commented on a change in pull request #20802:
URL: https://github.com/apache/airflow/pull/20802#discussion_r782010936



##########
File path: airflow/models/dagbag.py
##########
@@ -398,6 +400,10 @@ def _process_modules(self, filepath, mods, file_last_changed_on_disk):
             dag.fileloc = mod.__file__
             try:
                 dag.timetable.validate()
+                # create a copy of params before validating
+                copied_params = copy.deepcopy(dag.params)
+                copied_params.update(conf or {})

Review comment:
       what is the value of `conf` here?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ephraimbuddy commented on a change in pull request #20802: Only validate Params when DAG is triggered

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on a change in pull request #20802:
URL: https://github.com/apache/airflow/pull/20802#discussion_r782017751



##########
File path: airflow/models/dagbag.py
##########
@@ -398,6 +400,10 @@ def _process_modules(self, filepath, mods, file_last_changed_on_disk):
             dag.fileloc = mod.__file__
             try:
                 dag.timetable.validate()
+                # create a copy of params before validating
+                copied_params = copy.deepcopy(dag.params)
+                copied_params.update(conf or {})

Review comment:
       Oops...didn't think much about that. Wrong here. I see that validating at create_dagrun is necessary too. I will add it back but then, since we validate during parsing, the check at create_dagrun would check for error when conf is added manually.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org