Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/07/28 14:57:50 UTC

[GitHub] [airflow] moreaupascal56 opened a new issue, #25373: SimpleHttpOperator: have an option to load data by batches

moreaupascal56 opened a new issue, #25373:
URL: https://github.com/apache/airflow/issues/25373

   ### Description
   
   `SimpleHttpOperator` is a useful way to leverage Airflow HTTP connections.
   But right now the operator can only load all the data in a single request, not in batches. It would be useful to be able to load the data in batches inside one `SimpleHttpOperator` instead of chaining several `SimpleHttpOperator` tasks.
   
   
   
   ### Use case/motivation
   
   Some APIs limit the number of rows allowed in one PUT request. In that case we have to use several `SimpleHttpOperator` tasks (N of them to load N times the API's limit). It would be easier to have a single operator (or a new HTTP operator) that loads the data in batches.
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] boring-cyborg[bot] commented on issue #25373: SimpleHttpOperator: have an option to load data by batches

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #25373:
URL: https://github.com/apache/airflow/issues/25373#issuecomment-1198263674

   Thanks for opening your first issue here! Be sure to follow the issue template!
   




[GitHub] [airflow] moreaupascal56 closed issue #25373: SimpleHttpOperator: have an option to load data by batches

Posted by GitBox <gi...@apache.org>.
moreaupascal56 closed issue #25373: SimpleHttpOperator: have an option to load data by batches 
URL: https://github.com/apache/airflow/issues/25373




[GitHub] [airflow] moreaupascal56 commented on issue #25373: SimpleHttpOperator: have an option to load data by batches

Posted by GitBox <gi...@apache.org>.
moreaupascal56 commented on issue #25373:
URL: https://github.com/apache/airflow/issues/25373#issuecomment-1202156980

   Thanks @potiuk for your answer. Yes, after a little more thought this seems too complicated; it will be better to use something else. Have a great day, closing this :)




[GitHub] [airflow] potiuk commented on issue #25373: SimpleHttpOperator: have an option to load data by batches

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #25373:
URL: https://github.com/apache/airflow/issues/25373#issuecomment-1198323423

   I am not sure what interface you are proposing and why. I am not sure if it can be useful without seeing the interface, but if you want to make a PR proposal, feel free to attempt it. I think it might be difficult to design it in a useful way, though. `SimpleHttpOperator` generally has rather little use when it comes to parallel or massive requests, especially for downloading large amounts of data, because you then have to pass the data to other tasks.
   
   Note that even today you can use the @task decorator and either iterate over or parallelise requests and download data in batches using `HttpHook` instead of trying to do it with `SimpleHttpOperator`. IMHO it is far more efficient to follow the pattern I described in this blog post - https://medium.com/apache-airflow/generic-airflow-transfers-made-easy-5fe8e5e7d2c2 . I assigned you to it, but please take a look and consider whether what you want to do is needed at all.
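
For illustration only, here is a minimal sketch of the batching approach suggested above. It is not code from this thread or from Airflow itself. The helper names (`batched`, `send_in_batches`, `put_batch_with_http_hook`), the connection id `my_api`, and the endpoint `rows` are all hypothetical. The batching logic is plain Python; the Airflow import is deferred into the function that needs it, so that function only works where the HTTP provider is installed.

```python
from itertools import islice
from typing import Any, Callable, Iterable, Iterator, List


def batched(rows: Iterable[Any], batch_size: int) -> Iterator[List[Any]]:
    """Yield consecutive lists of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def send_in_batches(
    rows: Iterable[Any],
    batch_size: int,
    send: Callable[[List[Any]], Any],
) -> int:
    """Call send once per batch and return the number of batches sent."""
    sent = 0
    for batch in batched(rows, batch_size):
        send(batch)
        sent += 1
    return sent


def put_batch_with_http_hook(batch: List[Any]) -> None:
    """Hypothetical sender: PUT one batch as JSON through an Airflow HTTP connection.

    Assumes apache-airflow-providers-http is installed and that a connection
    named "my_api" exists; both the connection id and the endpoint are made up.
    """
    import json

    from airflow.providers.http.hooks.http import HttpHook

    hook = HttpHook(method="PUT", http_conn_id="my_api")
    hook.run(
        endpoint="rows",
        data=json.dumps(batch),
        headers={"Content-Type": "application/json"},
    )
```

Inside a function decorated with `@task`, a call such as `send_in_batches(rows, 500, put_batch_with_http_hook)` would then issue one PUT per 500 rows within a single task. Passing the sender as a callable keeps the batching logic testable without an Airflow installation.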

