Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/04/23 07:59:59 UTC

[GitHub] [airflow] jeffolsi opened a new issue #8525: SQLBranchOperator

jeffolsi opened a new issue #8525:
URL: https://github.com/apache/airflow/issues/8525


   Airflow has SQLSensor and PythonBranchOperator. It seems that the logic of both can be combined to create a SQLBranchOperator.
   
   SQLSensor takes a single SQL query and waits for a condition on it. It can be copied and changed so that it does not wait (as a sensor does) but simply returns true or false. The operator then follows the branch chosen according to the value returned by the query.
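
To make the difference between the two behaviours concrete, here is a minimal sketch that uses only Python's standard `sqlite3` module rather than Airflow itself. The function names (`first_cell_is_truthy`, `wait_like_sensor`, `branch_once`), the `orders` table, and the task ids are hypothetical and only illustrate the idea described in the issue; this is not how Airflow implements either class.

```python
import sqlite3
import time


def first_cell_is_truthy(conn, sql):
    """Run the query once; report whether the first cell of the first row is truthy."""
    row = conn.execute(sql).fetchone()
    return bool(row and row[0])


def wait_like_sensor(conn, sql, poke_interval=0.01, max_pokes=3):
    """Sensor style: re-run the query until it is truthy or the attempts run out."""
    for _ in range(max_pokes):
        if first_cell_is_truthy(conn, sql):
            return True
        time.sleep(poke_interval)
    return False


def branch_once(conn, sql, if_true, if_false):
    """Branch style: run the query a single time and pick a task id from the result."""
    return if_true if first_cell_is_truthy(conn, sql) else if_false


# Hypothetical table and task ids, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER)")
print(branch_once(conn, "SELECT COUNT(*) FROM orders", "process_orders", "skip_processing"))
conn.execute("INSERT INTO orders VALUES (1)")
print(branch_once(conn, "SELECT COUNT(*) FROM orders", "process_orders", "skip_processing"))
```

With the empty table the first call selects `skip_processing`; once a row exists the second call selects `process_orders`.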


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] samuelkhtu edited a comment on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu edited a comment on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629226528


   Thank you @jeffolsi . I see. I guess what you are looking for is slightly different from the Python version. Yes, the SQL query can decide True/False, but we still need to let the operator know which "branch" or path you want to follow, right? How about the following?
   
   ``` Python
   BranchSqlOperator(
       conn_id: str,
       sql: str, 
       parameters: Optional, 
       follow_tasks_if_true: list, # list of task ids to follow if the SQL returns True
       follow_tasks_if_false: list, # list of task ids to follow if the SQL returns False
   )
   ```
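
The signature proposed above could behave roughly as in the following sketch. It is an assumption-laden stand-in, not the operator that was later merged: `BranchSqlSketch` and `choose_tasks` are hypothetical names, and a `sqlite3` connection is passed directly where the proposal has `conn_id`, which Airflow would resolve to a database hook.

```python
import sqlite3
from dataclasses import dataclass
from typing import Any, List, Optional, Sequence


@dataclass
class BranchSqlSketch:
    """Hypothetical stand-in for the proposed operator; NOT an Airflow class."""

    conn: sqlite3.Connection  # stands in for conn_id in the proposal
    sql: str
    follow_tasks_if_true: List[str]
    follow_tasks_if_false: List[str]
    parameters: Optional[Sequence[Any]] = None

    def choose_tasks(self) -> List[str]:
        """Run the query once and return the task ids for the matching branch."""
        row = self.conn.execute(self.sql, self.parameters or ()).fetchone()
        if row is None:
            raise ValueError("query returned no rows, so no branch can be chosen")
        return self.follow_tasks_if_true if row[0] else self.follow_tasks_if_false


conn = sqlite3.connect(":memory:")
op = BranchSqlSketch(
    conn=conn,
    sql="SELECT ? > 10",          # bound parameter, evaluated by SQLite
    parameters=(42,),
    follow_tasks_if_true=["load"],       # hypothetical task ids
    follow_tasks_if_false=["notify"],
)
print(op.choose_tasks())
```

Keeping the task ids in the operator arguments, rather than in the SQL, means the query only has to answer a yes/no question.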
   
   
   





[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-637638996


   Hi @mik-laj , can you help and close this item? Thanks.





[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-634136893


   Hi @potiuk , can you help and comment on a question raised by @eladkal in this PR? https://github.com/apache/airflow/pull/8942 
   
   I believe this PR is ready to go otherwise. 
   
   Thanks!





[GitHub] [airflow] potiuk commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-622334540


   Please do !





[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628197562


   Thank you both! 





[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628236594


   Thanks @pujaji  , sounds like you've got this. I will circle back in a few days. Can't wait to see this new operator!





[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628239493


   Hi @pujaji , I am sorry I misunderstood you. Sounds good. I will give this a try!  





[GitHub] [airflow] potiuk commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628195754


   Yep. Thanks @jeffolsi for the explanation. Yep. I think this is about right.





[GitHub] [airflow] samuelkhtu edited a comment on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu edited a comment on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-637638996


   Hi @mik-laj , can you help and close this item? This operator is merged to master. (https://github.com/apache/airflow/pull/8942) Thanks.





[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-637654316


   Thank you for the tips! 





[GitHub] [airflow] pujaji commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
pujaji commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628239217


   @samuelkhtu No, Samuel, you have got me wrong. I have stopped working on this. I urge you to accomplish this and present it to the community.





[GitHub] [airflow] potiuk closed issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #8525:
URL: https://github.com/apache/airflow/issues/8525


   





[GitHub] [airflow] samuelkhtu edited a comment on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu edited a comment on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629226528


   Thank you @jeffolsi . I see. I guess what you are looking for is slightly different from the Python version. Yes, the SQL query can decide True/False, but we still need to let the operator know which "branch" or path you want to follow, right? How about the following?
   
   
   BranchSqlOperator(
   conn_id: str,
   sql: str, 
   parameters: Optional, 
   follow_tasks_if_true: list, # list of task ids to follow if the SQL returns True
   follow_tasks_if_false: list, # list of task ids to follow if the SQL returns False
   )
   
   
   
   
   
   
   





[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-637028324


   Hello @jeffolsi , the new operator is in. Maybe we can close this issue?





[GitHub] [airflow] potiuk edited a comment on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-637641497


   Closed. For the future - it's enough to add `Closes #ISSUE` in the commit message so that the issue is closed automatically on merge :)





[GitHub] [airflow] jeffolsi commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
jeffolsi commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629287817


   @samuelkhtu exactly what I was thinking about





[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-631871536


   Code and test completed for this item. I am working on the PR. 





[GitHub] [airflow] samuelkhtu removed a comment on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu removed a comment on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628196134


   Thanks @jeffolsi & @potiuk 
   
   Since there are many SQL flavors supported within Airflow, do you have a specific backend in mind? SQLite? Postgres? MySQL?
   
   Or maybe you are thinking of a generic SQL branch operator that uses the ODBC hook instead?
   
   





[GitHub] [airflow] pujaji commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
pujaji commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-622330585


   Hey! I would like to work on this





[GitHub] [airflow] pujaji edited a comment on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
pujaji edited a comment on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628240078


   No regrets! Happy coding-contributing!! :+1: 





[GitHub] [airflow] samuelkhtu edited a comment on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu edited a comment on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629226528


   Thank you @jeffolsi . I see. I guess what you are looking for is slightly different from the Python version. Yes, the SQL query can decide True/False, but we still need to let the operator know which "branch" or path you want to follow, right? How about the following?
   
   ``` Python
   BranchSqlOperator(
       conn_id: str,
       sql: str, 
       parameters: Optional, 
       follow_task_Ids_if_true: list, # list of task ids to follow if the SQL returns True
       follow_task_Ids_if_false: list, # list of task ids to follow if the SQL returns False
   )
   ```
   
   
   





[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629029045


   Hey @jeffolsi , quick question for you. In the existing Airflow Python branching operator, the Python callable returns the 'task_id' or a list of 'task_ids' that selects the branch to follow.
   
   I am just wondering whether you would like to use the SQL query to select the branches as well.
   
   For example, the SQL query "SELECT 'branch_a', 'branch_b'" will return 2 columns, and the SQLBranchOperator will follow branch_a and branch_b (branch_a and branch_b are task_ids within the DAG).
   
   Or do you expect the SQL query to return multiple rows, with each row representing a task_id within the DAG?
   
   Thanks
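
The two result shapes asked about above can be sketched as follows, again with the standard `sqlite3` module instead of Airflow. The helper names `task_ids_from_columns` and `task_ids_from_rows` are hypothetical; the sketch only shows how task ids would be read from each shape, not which design was chosen.

```python
import sqlite3


def task_ids_from_columns(conn, sql):
    """Option 1: treat every column of the first row as a task id."""
    row = conn.execute(sql).fetchone()
    return list(row) if row else []


def task_ids_from_rows(conn, sql):
    """Option 2: treat the first column of every row as a task id."""
    return [row[0] for row in conn.execute(sql).fetchall()]


conn = sqlite3.connect(":memory:")

# One row with two columns.
print(task_ids_from_columns(conn, "SELECT 'branch_a', 'branch_b'"))

# Two rows with one column each.
print(task_ids_from_rows(conn, "SELECT 'branch_a' UNION ALL SELECT 'branch_b'"))
```

In both variants the task ids live inside the SQL text, so renaming a task in the DAG also means editing the query.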
   





[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628196134


   Thanks @jeffolsi & @potiuk 
   
   Since there are many SQL flavors supported within Airflow, do you have a specific backend in mind? SQLite? Postgres? MySQL?
   
   Or maybe you are thinking of a generic SQL branch operator that uses the ODBC hook instead?
   
   





[GitHub] [airflow] boring-cyborg[bot] commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-618245385


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-632786479


   Quick update. The PR is under review. 





[GitHub] [airflow] potiuk commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-637641497


   Closed. For the future - it's enough to add Closes #ISSUE in the commit message so that the issue is closed automatically on merge :)





[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629226528


   Thank you @jeffolsi . I see. I guess what you are looking for is slightly different from the Python version. Yes, the SQL query can decide True/False, but we still need to let the operator know which "branch" or path you want to follow, right? How about the following?
   
   ``` Python
   BranchSqlOperator(
       conn_id: str,
       sql: str, 
       parameters: Optional, 
       follow_tasks_if_true: list, # list of task ids to follow if the SQL returns True
       follow_tasks_if_false: list, # list of task ids to follow if the SQL returns False
   )
   ```
   
   
   
   
   
   





[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629296674


   Great. Thank you @jeffolsi ! 
   





[GitHub] [airflow] pujaji commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
pujaji commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628240078


   No regrets! Happy coding-contributing!! :dagger: 





[GitHub] [airflow] jeffolsi commented on issue #8525: SQLBranchOperator

Posted by GitBox <gi...@apache.org>.
jeffolsi commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629204930


   @samuelkhtu the query returns only true / false.
   Think of the query as the equivalent of the Python callable. The callable returns only true or false, and with that the branch to follow is decided.
   
   I don't think it's a good idea to combine the task_id into the SQL.
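
One practical detail behind "the query returns only true / false" is that different backends may hand back a boolean, an integer, or a string for the same logical answer. The sketch below shows one possible way to normalise such values before choosing a branch. The accepted string sets and the names `to_bool` and `pick_branch` are assumptions made for illustration, not a description of what the merged operator accepts.

```python
# Assumed spellings of true/false; adjust to whatever the backend returns.
TRUE_STRINGS = {"true", "t", "yes", "y", "1", "on"}
FALSE_STRINGS = {"false", "f", "no", "n", "0", "off"}


def to_bool(value):
    """Map the first cell of a query result to True or False, or raise."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0
    if isinstance(value, str):
        text = value.strip().lower()
        if text in TRUE_STRINGS:
            return True
        if text in FALSE_STRINGS:
            return False
    raise ValueError(f"cannot interpret {value!r} as true/false")


def pick_branch(value, follow_if_true, follow_if_false):
    """Choose between two task ids based on the normalised query value."""
    return follow_if_true if to_bool(value) else follow_if_false


for cell in (True, 0, "t", "No"):
    print(repr(cell), "->", pick_branch(cell, "branch_true", "branch_false"))
```

Raising on unrecognised values (including `None`) avoids silently following the "false" branch when the query returns something unexpected.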

