Posted to commits@airflow.apache.org by "ankushpurwar (via GitHub)" <gi...@apache.org> on 2023/03/03 08:03:20 UTC

[GitHub] [airflow] ankushpurwar opened a new issue, #29893: Airflow REST API: Add generic capability to retrieve required information for the specified entity..

ankushpurwar opened a new issue, #29893:
URL: https://github.com/apache/airflow/issues/29893

   ### Description
   
   The Airflow REST API should offer a generic way to retrieve only the required information instead of always sending everything.
   E.g. if I want to retrieve DAG run details using the REST API: https://airflow.apache.org/api/v1/dags/{dag_id}/dagRuns/{dag_run_id}
   Or want to fetch the list of DAGs using the REST API: https://airflow.apache.org/api/v1/dags
   
   It always returns the full details, although the caller is often not interested in all of the information.
   
   So I suggest adding a generic capability to retrieve only the needed information, as a query parameter just like offset and limit.
   E.g. if we pass fields = {dag_id, is_paused} as a query parameter when calling the https://airflow.apache.org/api/v1/dags API, it returns a JSON body containing only the {dag_id, is_paused} fields.
   
   The same applies to other endpoints as well (at least the GET ones).
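
A minimal sketch of the behaviour requested above, assuming a hypothetical comma-separated `fields` query parameter (the exact syntax is not settled in this thread). The helpers `parse_fields` and `project` and the sample payload are illustrative only and are not part of Airflow.

```python
from typing import List, Optional
from urllib.parse import parse_qs, urlsplit


def parse_fields(url: str) -> Optional[List[str]]:
    """Return the requested field names, or None when no `fields` parameter is given."""
    query = parse_qs(urlsplit(url).query)
    if "fields" not in query:
        return None
    # Accept both `fields=a,b` and repeated `fields=a&fields=b`.
    names: List[str] = []
    for value in query["fields"]:
        names.extend(name.strip() for name in value.split(",") if name.strip())
    return names


def project(record: dict, fields: Optional[List[str]]) -> dict:
    """Keep only the requested keys; return the full record when fields is None."""
    if fields is None:
        return dict(record)
    unknown = [name for name in fields if name not in record]
    if unknown:
        raise ValueError(f"unknown fields: {unknown}")
    return {name: record[name] for name in fields}


# Hypothetical full DAG payload, shortened for illustration.
dag = {
    "dag_id": "example",
    "is_paused": False,
    "fileloc": "/dags/example.py",
    "owners": ["airflow"],
}

url = "https://airflow.apache.org/api/v1/dags?limit=10&fields=dag_id,is_paused"
print(project(dag, parse_fields(url)))  # {'dag_id': 'example', 'is_paused': False}
```

Omitting the parameter keeps today's behaviour (the full record), so existing callers would not be affected under this assumption.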
   
   
   ### Use case/motivation
   
   1. Fetch only the information we actually want from the server.
   2. Save network bandwidth by reducing the response to the required fields.
   3. Make it possible to collect more data in one go.
   
   ### Related issues
   
   Cannot say.
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Airflow REST API: Add generic capability to retrieve required information for the specified entity.. [airflow]

Posted by "HarryWu-CHN (via GitHub)" <gi...@apache.org>.
HarryWu-CHN commented on issue #29893:
URL: https://github.com/apache/airflow/issues/29893#issuecomment-1874805047

   @hussein-awala @maahir22 
   Hello, I would like to contribute to this issue. Could I be assigned to it?
   I am new to Airflow, so could I get some help? I can locate `airflow/api_connexion/endpoints/dag_endpoint.get_dags`, but what calls this function? I see that the result of `SQLAlchemySchema.dump` is returned directly, so what would be good practice for extracting only the required fields?
   
   




[GitHub] [airflow] maahir22 commented on issue #29893: Airflow REST API: Add generic capability to retrieve required information for the specified entity..

Posted by "maahir22 (via GitHub)" <gi...@apache.org>.
maahir22 commented on issue #29893:
URL: https://github.com/apache/airflow/issues/29893#issuecomment-1484099302

   Awesome, let me know if you need any help!




[GitHub] [airflow] boring-cyborg[bot] commented on issue #29893: Airflow REST API: Add generic capability to retrieve required information for the specified entity..

Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on issue #29893:
URL: https://github.com/apache/airflow/issues/29893#issuecomment-1453127983

   Thanks for opening your first issue here! Be sure to follow the issue template!
   




[GitHub] [airflow] hussein-awala commented on issue #29893: Airflow REST API: Add generic capability to retrieve required information for the specified entity..

Posted by "hussein-awala (via GitHub)" <gi...@apache.org>.
hussein-awala commented on issue #29893:
URL: https://github.com/apache/airflow/issues/29893#issuecomment-1468850041

   @zazemlenie Sure, I assigned it to you




Re: [I] Airflow REST API: Add generic capability to retrieve required information for the specified entity.. [airflow]

Posted by "HarryWu-CHN (via GitHub)" <gi...@apache.org>.
HarryWu-CHN commented on issue #29893:
URL: https://github.com/apache/airflow/issues/29893#issuecomment-1875515968

   @potiuk 
   Thank you for explaining the task in detail!🌸 But just for `dags` or `dagRuns`, wouldn't it be enough to pass the `only` parameter when the Schema() is created?
   ```
   dag_schema = DAGSchema(only=fields)
   return dag_schema.dump(dag)
   ```
   And add `nullable: true` to the returned properties in `airflow/api_connexion/openapi/v1.yaml`.
   
   I think it is hard to solve this task generically for now, because the Swagger YAML files are not generated automatically from the schemas. If they were, the YAML could also set `nullable` based on whether each schema field's `required` is true or not.
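
A rough sketch of the idea in the last paragraph, assuming the field metadata were available in code. `FieldSpec`, `to_openapi_properties` and the two sample fields are hypothetical names made up for illustration; they are not Airflow's schema classes, and Airflow's `v1.yaml` is not generated this way today.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical description of one schema field."""

    name: str
    openapi_type: str
    required: bool


def to_openapi_properties(fields: Iterable[FieldSpec]) -> Dict[str, dict]:
    """Build an OpenAPI-style `properties` mapping; non-required fields become nullable."""
    properties: Dict[str, dict] = {}
    for spec in fields:
        prop = {"type": spec.openapi_type}
        if not spec.required:
            # Mirrors the suggestion above: `nullable` follows from `required` being false.
            prop["nullable"] = True
        properties[spec.name] = prop
    return properties


# Assumed example fields; the `required` flags here are illustrative.
dag_fields = [
    FieldSpec("dag_id", "string", required=True),
    FieldSpec("is_paused", "boolean", required=False),
]

print(to_openapi_properties(dag_fields))
```

The resulting mapping could then be serialised to YAML, which would keep the specification and the schema in step without hand-editing.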




Re: [I] Airflow REST API: Add generic capability to retrieve required information for the specified entity.. [airflow]

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #29893:
URL: https://github.com/apache/airflow/issues/29893#issuecomment-1875521578

   This issue is about `generic` functionality. If you want to do a limited version for only dags or dagRuns, feel free to open PRs with fixes, but they would not close this issue.




[GitHub] [airflow] hussein-awala commented on issue #29893: Airflow REST API: Add generic capability to retrieve required information for the specified entity..

Posted by "hussein-awala (via GitHub)" <gi...@apache.org>.
hussein-awala commented on issue #29893:
URL: https://github.com/apache/airflow/issues/29893#issuecomment-1454326434

   Sounds like a good feature, want to work on it and be an Airflow contributor?




Re: [I] Airflow REST API: Add generic capability to retrieve required information for the specified entity.. [airflow]

Posted by "HarryWu-CHN (via GitHub)" <gi...@apache.org>.
HarryWu-CHN commented on issue #29893:
URL: https://github.com/apache/airflow/issues/29893#issuecomment-1875382856

   What is the meaning of `POC`, please? As you mention `implement a POC`.
   @potiuk 




[GitHub] [airflow] maahir22 commented on issue #29893: Airflow REST API: Add generic capability to retrieve required information for the specified entity..

Posted by "maahir22 (via GitHub)" <gi...@apache.org>.
maahir22 commented on issue #29893:
URL: https://github.com/apache/airflow/issues/29893#issuecomment-1484095290

   Has there been any development on this? I would like to contribute if possible. Are we planning to support fetching only specific fields for every GET endpoint? Won't there be an issue with the query string getting too long, or do we plan to impose limits on the granularity of fields that can be fetched? @hussein-awala @zazemlenie




[GitHub] [airflow] zazemlenie commented on issue #29893: Airflow REST API: Add generic capability to retrieve required information for the specified entity..

Posted by "zazemlenie (via GitHub)" <gi...@apache.org>.
zazemlenie commented on issue #29893:
URL: https://github.com/apache/airflow/issues/29893#issuecomment-1484099005

   I'm working on this issue. I haven't run into the query string issue you mentioned, but I'll look into it more closely.




[GitHub] [airflow] zazemlenie commented on issue #29893: Airflow REST API: Add generic capability to retrieve required information for the specified entity..

Posted by "zazemlenie (via GitHub)" <gi...@apache.org>.
zazemlenie commented on issue #29893:
URL: https://github.com/apache/airflow/issues/29893#issuecomment-1468844950

   Can I work on this issue?




Re: [I] Airflow REST API: Add generic capability to retrieve required information for the specified entity.. [airflow]

Posted by "jackkolbert (via GitHub)" <gi...@apache.org>.
jackkolbert commented on issue #29893:
URL: https://github.com/apache/airflow/issues/29893#issuecomment-1806474813

   Hi, I would like to contribute to this issue, could I be assigned it? Thank you




Re: [I] Airflow REST API: Add generic capability to retrieve required information for the specified entity.. [airflow]

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #29893:
URL: https://github.com/apache/airflow/issues/29893#issuecomment-1875390569

   Proof Of Concept.




Re: [I] Airflow REST API: Add generic capability to retrieve required information for the specified entity.. [airflow]

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #29893:
URL: https://github.com/apache/airflow/issues/29893#issuecomment-1875024049

   I assigned you, but part of the task is to propose how to do it. Generally speaking, generic retrieval/update of partial information is something that GraphQL attempted to do as the "next gen" API, attempting to "fix" what was broken in REST.
   
   https://graphql.org/
   
   However, my personal opinion (and that of many people) is that GraphQL is quite a bit TOO generic. It is relatively popular and used in quite a few places, but mostly in the "corporate" world and in big installations because, unlike REST, it is not intuitive and the learning curve is, well, steep IMHO. Personally, I have never been thrilled by the idea of learning more about GraphQL and getting the hang of it. It also tries to address all-but-the-kitchen-sink aspects of the API (including rate limiting, introspection, etc.). In most implementations it is very difficult to get performance right, and there are plenty of other issues with it.
   
   You can read more about it here, for example: https://blog.logrocket.com/graphql-vs-rest-api-why-you-shouldnt-use-graphql/
   
   IMHO (but this is my opinion), we need something much simpler and more straightforward here, and rather than defining and following a "standard", we should possibly tap into what other people doing similar things have done, because our API is described with an OpenAPI definition, and our REST endpoint documentation, Swagger UI and everything else we have in the API is generated. That's especially important as our clients (notably https://github.com/apache/airflow-client-python) are generated using the OpenAPI client generator, which translates the OpenAPI specification into Python classes that you can import and use directly. This goes for other languages as well.
   
   This is a bit tricky, because the generator produces the objects returned, so if the API returns partial objects, then it cannot return ACTUAL OBJECTS. It can return dictionaries, for example, or some proxy objects that contain only part of the data, while the rest of the data **might** be retrieved lazily.
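
To make the proxy-object option concrete, here is a small sketch under stated assumptions. `PartialRecord` and `fetch_full` are hypothetical names invented for this illustration; the generated Airflow clients do not provide such a class, and a real client would call the API where `fetch_full` returns a canned dictionary.

```python
from typing import Any, Callable, Dict


class PartialRecord:
    """Proxy holding the fields already fetched; loads the full record on first miss."""

    def __init__(self, partial: Dict[str, Any], load_full: Callable[[], Dict[str, Any]]):
        self._data = dict(partial)
        self._load_full = load_full
        self._complete = False

    def __getattr__(self, name: str) -> Any:
        # Only invoked when normal attribute lookup fails, i.e. for record fields.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._data and not self._complete:
            # Lazily retrieve the remaining fields, at most once.
            self._data.update(self._load_full())
            self._complete = True
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None


calls = []


def fetch_full() -> Dict[str, Any]:
    """Stand-in for a second request that returns the complete record."""
    calls.append("full")
    return {"dag_id": "example", "is_paused": False, "fileloc": "/dags/example.py"}


dag = PartialRecord({"dag_id": "example", "is_paused": False}, fetch_full)
print(dag.dag_id, len(calls))   # example 0 (served from the partial data)
print(dag.fileloc, len(calls))  # /dags/example.py 1 (one lazy full fetch)
```

The trade-off is that a missing field silently costs an extra round trip, which is one reason a plain dictionary may be the simpler return type.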
   
   So the challenge is finding a way to do it that:
   
   a) is simple
   b) builds on top of REST without changing it to GraphQL
   c) integrates nicely with the OpenAPI definition and Swagger
   d) integrates with OpenAPI generators to allow such partial retrieval
   
   So this task is really:
   
   * to research what is out there
   * to see if it is mature enough to use, or whether we should develop something on our own, either inspired by things out there or designed from scratch
   * implement a POC
   * run a project to apply it to all end-points, starting from the most important ones
   
   
   

