Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/04/23 07:59:59 UTC
[GitHub] [airflow] jeffolsi opened a new issue #8525: SQLBranchOperator
jeffolsi opened a new issue #8525:
URL: https://github.com/apache/airflow/issues/8525
Airflow has SQLSensor and PythonBranchOperator. It seems that the logic of both can be combined to create a SQLBranchOperator.
SQLSensor takes a single SQL query and waits for a condition on it. It can be copied and changed so that it doesn't wait (like a sensor) but simply returns true or false. Depending on the value returned from the query, it will follow the chosen branch.
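To make the difference between waiting and branching concrete, here is a minimal sketch. It is not Airflow code: it uses the standard-library `sqlite3` module, and the helpers `first_cell_is_truthy` and `poke_until_true` are hypothetical names chosen only for illustration.

```python
import sqlite3
import time


def first_cell_is_truthy(conn, sql):
    """Run the query once; report whether the first cell of the first row is truthy."""
    row = conn.execute(sql).fetchone()
    return bool(row and row[0])


def poke_until_true(conn, sql, interval=0.01, max_pokes=5):
    """Sensor-style behaviour: re-run the query until it is truthy or attempts run out."""
    for _ in range(max_pokes):
        if first_cell_is_truthy(conn, sql):
            return True
        time.sleep(interval)
    return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER)")

# Branch-style behaviour: evaluate once and act on the answer immediately,
# instead of polling until the condition becomes true.
has_events = first_cell_is_truthy(conn, "SELECT COUNT(*) FROM events")
```

The single evaluation is the only part a branching operator would need; the polling loop is what the request proposes to drop.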
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] samuelkhtu edited a comment on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu edited a comment on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629226528
Thank you @jeffolsi. I see. I guess what you are looking for is slightly different from the Python version. Yes, the SQL query can decide True/False, but we still need to let the operator know which "branch" or path you want to follow, right? How about the following?
```python
BranchSqlOperator(
    conn_id: str,
    sql: str,
    parameters: Optional,
    follow_tasks_if_true: list,   # list of task ids to follow if the SQL returns True
    follow_tasks_if_false: list,  # list of task ids to follow if the SQL returns False
)
```
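The branching decision behind this proposed signature could look roughly like the sketch below. This is an assumption-laden illustration, not the operator that was later merged: `choose_branch` is a hypothetical function, the connection is a plain `sqlite3` connection rather than an Airflow `conn_id`, and only the first cell of the first row is inspected.

```python
import sqlite3
from typing import Any, List, Optional, Sequence


def choose_branch(
    conn: sqlite3.Connection,
    sql: str,
    follow_tasks_if_true: List[str],
    follow_tasks_if_false: List[str],
    parameters: Optional[Sequence[Any]] = None,
) -> List[str]:
    """Return the task ids to follow, based on the first cell of the query result."""
    row = conn.execute(sql, parameters or ()).fetchone()
    if row is None:
        # Assumption: an empty result is treated as an error, not as False.
        raise ValueError("query returned no rows, cannot choose a branch")
    return follow_tasks_if_true if row[0] else follow_tasks_if_false


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount INTEGER)")
conn.execute("INSERT INTO orders VALUES (250)")

selected = choose_branch(
    conn,
    "SELECT COUNT(*) > 0 FROM orders WHERE amount > ?",
    follow_tasks_if_true=["notify_sales"],
    follow_tasks_if_false=["skip_notification"],
    parameters=(100,),
)
```

Keeping the task ids in the operator arguments, rather than in the query, means the SQL only has to answer a yes/no question.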
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-637638996
Hi @mik-laj , can you help and close this item? Thanks.
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-634136893
Hi @potiuk , can you help and comment on a question raised by @eladkal in this PR? https://github.com/apache/airflow/pull/8942
I believe this PR is ready to go otherwise.
Thanks!
----------------------------------------------------------------
[GitHub] [airflow] potiuk commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-622334540
Please do !
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628197562
Thank you both!
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628236594
Thanks @pujaji , sounds like you've got this. I will circle back in a few days. Can't wait to see this new operator!
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628239493
Hi @pujaji , I am sorry I misunderstood you. Sounds good. I will give this a try!
----------------------------------------------------------------
[GitHub] [airflow] potiuk commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628195754
Yep. Thanks @jeffolsi for the explanation. Yep. I think this is about right.
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu edited a comment on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu edited a comment on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-637638996
Hi @mik-laj , can you help and close this item? This operator is merged to master. (https://github.com/apache/airflow/pull/8942) Thanks.
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-637654316
Thank you for the tips!
----------------------------------------------------------------
[GitHub] [airflow] pujaji commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
pujaji commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628239217
@samuelkhtu No, Samuel, you have got me wrong. I have stopped working on this. I urge you to accomplish this and present it to the community.
----------------------------------------------------------------
[GitHub] [airflow] potiuk closed issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
potiuk closed issue #8525:
URL: https://github.com/apache/airflow/issues/8525
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-637028324
Hello @jeffolsi , the new operator is in. Maybe we can close this issue?
----------------------------------------------------------------
[GitHub] [airflow] potiuk edited a comment on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-637641497
Closed. For the future - it's enough to add `Closes #ISSUE` in the commit message so that the issue is closed automatically on merge :)
----------------------------------------------------------------
[GitHub] [airflow] jeffolsi commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
jeffolsi commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629287817
@samuelkhtu exactly what i was thinking about
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-631871536
Code and test completed for this item. I am working on the PR.
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu removed a comment on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu removed a comment on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628196134
Thanks @jeffolsi & @potiuk
Since there are many SQL flavors supported within Airflow, do you have a specific backend in mind? SQLite? Postgres? MySQL?
Maybe you are thinking of a generic SQL branch operator using the ODBC hook instead?
----------------------------------------------------------------
[GitHub] [airflow] pujaji commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
pujaji commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-622330585
Hey! I would like to work on this
----------------------------------------------------------------
[GitHub] [airflow] pujaji edited a comment on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
pujaji edited a comment on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628240078
No regrets! Happy coding-contributing!! :+1:
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu edited a comment on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu edited a comment on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629226528
Thank you @jeffolsi. I see. I guess what you are looking for is slightly different from the Python version. Yes, the SQL query can decide True/False, but we still need to let the operator know which "branch" or path you want to follow, right? How about the following?
```python
BranchSqlOperator(
    conn_id: str,
    sql: str,
    parameters: Optional,
    follow_task_Ids_if_true: list,   # list of task ids to follow if the SQL returns True
    follow_task_Ids_if_false: list,  # list of task ids to follow if the SQL returns False
)
```
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629029045
Hey @jeffolsi, quick question for you. In the existing Airflow Python Branching Operator, the Python callback function returns the 'task_id' or list of 'task_ids' that selects the branch to follow.
I am just wondering if you would like to use the SQL query to select the branches as well?
For example, the SQL query "SELECT 'branch_a', 'branch_b'" will return 2 columns, and the SQLBranchOperator will follow branch_a and branch_b. (branch_a and branch_b are task_ids within the DAG.)
Or do you expect the SQL query to return multiple rows, where each row represents a task_id within the DAG?
Thanks
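The two readings of the result set asked about here can be sketched as follows. This is only an illustration using the standard-library `sqlite3` module; `task_ids_from_columns` and `task_ids_from_rows` are hypothetical helper names, not part of Airflow.

```python
import sqlite3


def task_ids_from_columns(conn, sql):
    """Reading 1: every column of the first row is a task id."""
    row = conn.execute(sql).fetchone()
    return [str(value) for value in row] if row else []


def task_ids_from_rows(conn, sql):
    """Reading 2: the first column of every row is a task id."""
    return [str(row[0]) for row in conn.execute(sql).fetchall()]


conn = sqlite3.connect(":memory:")

by_columns = task_ids_from_columns(conn, "SELECT 'branch_a', 'branch_b'")
by_rows = task_ids_from_rows(
    conn, "SELECT 'branch_a' UNION ALL SELECT 'branch_b'"
)
```

Both readings yield the same two task ids here, but either way the task ids end up embedded in the SQL text itself.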
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628196134
Thanks @jeffolsi & @potiuk
Since there are many SQL flavors supported within Airflow, do you have a specific backend in mind? SQLite? Postgres? MySQL?
Maybe you are thinking of a generic SQL branch operator using the ODBC hook instead?
----------------------------------------------------------------
[GitHub] [airflow] boring-cyborg[bot] commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-618245385
Thanks for opening your first issue here! Be sure to follow the issue template!
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-632786479
Quick update. The PR is under review.
----------------------------------------------------------------
[GitHub] [airflow] potiuk commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-637641497
Closed. For the future - it's enough to add Closes #ISSUE in the commit message so that the issue is closed automatically on merge :)
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629226528
Thank you @jeffolsi. I see. I guess what you are looking for is slightly different from the Python version. Yes, the SQL query can decide True/False, but we still need to let the operator know which "branch" or path you want to follow, right? How about the following?
`
BranchSqlOperator(
conn_id: str,
sql: str,
parameters: Optional,
follow_tasks_if_true: list, # list of task ids to follow if SQL return True,
follow_tasks_if_false: list, # list of task ids to follow if SQL return False
)
`
----------------------------------------------------------------
[GitHub] [airflow] samuelkhtu commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
samuelkhtu commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629296674
Great. Thank you @jeffolsi !
----------------------------------------------------------------
[GitHub] [airflow] pujaji commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
pujaji commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-628240078
No regrets! Happy coding-contributing!! :dagger:
----------------------------------------------------------------
[GitHub] [airflow] jeffolsi commented on issue #8525: SQLBranchOperator
Posted by GitBox <gi...@apache.org>.
jeffolsi commented on issue #8525:
URL: https://github.com/apache/airflow/issues/8525#issuecomment-629204930
@samuelkhtu the query returns only true / false.
Think of the query as the equivalent of the Python callable. The callable returns only true or false, and that decides which branch to follow.
I don't think it's a good idea to put the task_id into the SQL.
----------------------------------------------------------------