
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2019/06/28 20:03:13 UTC

[GitHub] [airflow] davlum edited a comment on issue #5481: [AIRFLOW-4851] Refactor K8S codebase with k8s API models

davlum edited a comment on issue #5481: [AIRFLOW-4851] Refactor K8S codebase with k8s API models
URL: https://github.com/apache/airflow/pull/5481#issuecomment-506757977

@ashb @pgagnon I have a couple of thoughts and would appreciate some feedback. As far as I am aware, there are currently four places where a pod can be configured or created:
1. From the `airflow.cfg` with `KubernetesExecutor`.
2. From an Operator with `KubernetesExecutor` using the argument ```executor_config = { 'KubernetesExecutor': { ... }}```
3. From the `KubernetesPodOperator`.
4. From the `pod_mutation_hook`.

Ideally there'd be one interface for all of these, whereas currently there seem to be several: (1) uses `WorkerConfiguration`, (2) uses `KubernetesExecutorConfig`, and (3) uses `PodGenerator`. Each of these in turn offers a different level of coverage of the Kubernetes API and then creates our custom `Pod` object, which itself must implement all parts of the Kubernetes API and offer a serialization method into JSON that conforms with the API. Ideally we would offer a very thin layer of abstraction over the creation of a `V1Pod` object for convenience (and have this be largely backwards compatible) _and_ offer creating a `V1Pod` object totally raw. This is similar to how an ORM's abstraction doesn't offer every possible feature of SQL, so you sometimes need to write raw SQL.
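To make the "thin layer plus raw escape hatch" idea concrete, here is a minimal sketch. It deliberately uses plain dicts shaped like a Kubernetes Pod manifest rather than the real `V1Pod` model so that it runs with the standard library alone. The names `make_pod` and `merge_pod` are hypothetical and are not part of Airflow or of the Kubernetes client; an actual implementation would presumably build `kubernetes.client.V1Pod` objects instead.

```python
import copy
from typing import Any, Dict, List, Optional


def make_pod(
    name: str,
    image: str,
    cmds: Optional[List[str]] = None,
    env: Optional[Dict[str, str]] = None,
    namespace: str = "default",
) -> Dict[str, Any]:
    """Hypothetical convenience layer: covers only the most common pod fields."""
    container: Dict[str, Any] = {"name": "base", "image": image}
    if cmds:
        container["command"] = list(cmds)
    if env:
        # Kubernetes expects env vars as a list of {"name", "value"} mappings.
        container["env"] = [{"name": k, "value": v} for k, v in sorted(env.items())]
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"containers": [container]},
    }


def merge_pod(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical escape hatch: overlay a raw manifest fragment on a pod.

    Nested dicts are merged recursively; any other value (including lists)
    in `override` replaces the value in `base`. Neither input is mutated.
    """
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_pod(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Common case: the convenience layer is enough.
pod = make_pod("example", "python:3.7", cmds=["python", "-c", "print('hi')"])

# Uncommon case: the user supplies raw fields the convenience layer does not cover.
full_pod = merge_pod(pod, {"spec": {"hostNetwork": True}})
```

The convenience function stays small because anything it does not cover can still be expressed through the raw fragment, which is the same trade-off as the ORM versus raw SQL comparison above.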

My question is about backwards-incompatible changes. For example, the `KubernetesPodOperator` takes some of Airflow's internal Kubernetes models as arguments, such as `list[airflow.kubernetes.pod.Port]`. I have less hesitation about changing that, as it resides in `/contrib`. As for the `executor_config`, I believe I can keep it backwards compatible for the most part.

The ideal scenario in my mind is that the full `V1Pod` be exposed to users if they need it, which would address tickets such as [AIRFLOW-4454](https://issues.apache.org/jira/browse/AIRFLOW-4454) and [AIRFLOW-3152](https://issues.apache.org/jira/browse/AIRFLOW-3152), as they would have full access to the API. In another ticket, we could add functionality to just pass in configuration as a JSON/YAML string/file, which was discussed briefly [in the mailing list](https://lists.apache.org/thread.html/313132da516fca340243f4927e64177ea393d5ccae829d96b99f8e16@%3Cdev.airflow.apache.org%3E).
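As a rough illustration of the JSON option, the sketch below parses a pod definition from a JSON string and performs a few basic sanity checks. The function name `pod_from_json` and the checks it makes are assumptions for illustration only, not an existing Airflow API. YAML would work the same way but needs a third-party parser, so it is left out here.

```python
import json
from typing import Any, Dict


def pod_from_json(text: str) -> Dict[str, Any]:
    """Hypothetical loader: parse a Pod manifest from JSON and sanity-check it."""
    manifest = json.loads(text)
    if not isinstance(manifest, dict):
        raise ValueError("pod definition must be a JSON object")
    if manifest.get("kind") != "Pod":
        raise ValueError("expected kind 'Pod', got %r" % (manifest.get("kind"),))
    spec = manifest.get("spec")
    if not isinstance(spec, dict) or not spec.get("containers"):
        raise ValueError("pod definition needs spec.containers")
    return manifest


# Example input; the image and names are placeholders.
pod_json = """
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {"name": "example"},
  "spec": {"containers": [{"name": "base", "image": "python:3.7"}]}
}
"""
pod = pod_from_json(pod_json)
```

Full validation against the Kubernetes API schema would be left to the API server or the client library; the checks here only catch obviously malformed input early.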

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

With regards,
Apache Git Services