You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/12/21 08:10:27 UTC

[GitHub] [airflow] dstandish opened a new pull request #13209: Upgrade keda to v2.0.0

dstandish opened a new pull request #13209:
URL: https://github.com/apache/airflow/pull/13209


   Here we upgrade keda to v2.0.0 in helm chart and clean up the docs a bit.
   
   Additionally, since 2.0.0 adds support for statefulsets, the ScaledObject definition is updated to add support for the case where `workers.persistence.enabled=True`.
   
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-757353203


   [The Workflow run](https://github.com/apache/airflow/actions/runs/474373695) is cancelling this PR. Building images for the PR has failed. Follow the the workflow link to check the reason.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589585640



##########
File path: chart/tests/helm_template_generator.py
##########
@@ -31,9 +32,12 @@
 
 BASE_URL_SPEC = "https://raw.githubusercontent.com/instrumenta/kubernetes-json-schema/master/v1.14.0"
 
+crd_lookup = {
+    'keda.sh/v1alpha1::ScaledObject': 'https://raw.githubusercontent.com/kedacore/keda/v2.0.0/config/crd/bases/keda.sh_scaledobjects.yaml',  # noqa: E501 # pylint: disable=line-too-long
+}

Review comment:
       the above concatenation of api version and kind in this way, keda.sh/v1alpha1::ScaledObject is a made-up syntax that only serves to provide a way to look up the url for a particular combination of api version + kind




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589585640



##########
File path: chart/tests/helm_template_generator.py
##########
@@ -31,9 +32,12 @@
 
 BASE_URL_SPEC = "https://raw.githubusercontent.com/instrumenta/kubernetes-json-schema/master/v1.14.0"
 
+crd_lookup = {
+    'keda.sh/v1alpha1::ScaledObject': 'https://raw.githubusercontent.com/kedacore/keda/v2.0.0/config/crd/bases/keda.sh_scaledobjects.yaml',  # noqa: E501 # pylint: disable=line-too-long
+}

Review comment:
       the above concatenation of api version and kind in this way, keda.sh/v1alpha1::ScaledObject is a made-up syntax that only serves to provide a simple way to look up the url for a particular combination of api version + kind for CRDs




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589585640



##########
File path: chart/tests/helm_template_generator.py
##########
@@ -31,9 +32,12 @@
 
 BASE_URL_SPEC = "https://raw.githubusercontent.com/instrumenta/kubernetes-json-schema/master/v1.14.0"
 
+crd_lookup = {
+    'keda.sh/v1alpha1::ScaledObject': 'https://raw.githubusercontent.com/kedacore/keda/v2.0.0/config/crd/bases/keda.sh_scaledobjects.yaml',  # noqa: E501 # pylint: disable=line-too-long
+}

Review comment:
       the concatenation of api version and kind in this way, keda.sh/v1alpha1::ScaledObject is a made-up syntax that only serves to provide a way to look up the url for a particular combination of api version + kind




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-792869116


   forgot to push fixup adding `autoscaler` to spelling wordlist will merge afterward


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ianstanton commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
ianstanton commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-792993349


   > @mik-laj @ianstanton after taking another look at this i thought that adding the namedtuple in this case was maybe a little overkill / confusing because the elements are never accessed it's just a way constructing a key in the dict for the URL lookup, given the two components -- api version and kind.
   > 
   > can you please take a look and let me know what you think of this perhaps simpler (though less explicit) alternative.
   > 
   > and maybe you have a better suggestion.
   > 
   > ```python
   > crd_lookup = {
   >     'keda.sh/v1alpha1::ScaledObject': 'https://raw.githubusercontent.com/kedacore/keda/v2.0.0/config/crd/bases/keda.sh_scaledobjects.yaml',
   > }
   > ...
   > def get_schema_crd(api_version, kind):
   >     url = crd_lookup.get(f"{api_version}::{kind}")
   >     if not url:
   >         return None
   >     response = requests.get(url)
   >     yaml_schema = response.content.decode('utf-8')
   >     schema = yaml.safe_load(StringIO(yaml_schema))
   >     return schema
   > ```
   > 
   > so i'm just joining the two components with `::` to make a string that's pretty straightforward to understand so that we don't need the extra K8sObject namedtuple. Also leaving the key as a regular tuple does work but end up much uglier with line breaks.
   > 
   > i'm gonna just go ahead and add this in a second commit and request review and go from there
   
   @dstandish this approach seems acceptable to me :+1: 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-792089456


   Can you do a rebase? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] turbaszek commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
turbaszek commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589007638



##########
File path: docs/helm-chart/keda.rst
##########
@@ -24,27 +24,23 @@ KEDA stands for Kubernetes Event Driven Autoscaling.
 `KEDA <https://github.com/kedacore/keda>`__ is a custom controller that
 allows users to create custom bindings to the Kubernetes `Horizontal Pod
 Autoscaler <https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/>`__.
-We have built scalers that allows users to create scalers based on

Review comment:
       Exactly it me and Daniel, so not only Astronomer. Also Airflow is on KEDA users page.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589585640



##########
File path: chart/tests/helm_template_generator.py
##########
@@ -31,9 +32,12 @@
 
 BASE_URL_SPEC = "https://raw.githubusercontent.com/instrumenta/kubernetes-json-schema/master/v1.14.0"
 
+crd_lookup = {
+    'keda.sh/v1alpha1::ScaledObject': 'https://raw.githubusercontent.com/kedacore/keda/v2.0.0/config/crd/bases/keda.sh_scaledobjects.yaml',  # noqa: E501 # pylint: disable=line-too-long
+}

Review comment:
       the above concatenation of api version and kind in this way, keda.sh/v1alpha1::ScaledObject is a made-up syntax that only serves to provide a simple way to look up the url for a particular combination of api version + kind




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-792175597


   The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest master or amend the last commit of the PR, and push it with --force-with-lease.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589554875



##########
File path: docs/helm-chart/keda.rst
##########
@@ -54,19 +50,22 @@ to set ``worker.persistence.enabled`` to ``false``)
        --namespace airflow \
        --set executor=CeleryExecutor \
        --set workers.keda.enabled=true \
-       --set workers.persistence.enabled=false
+
+A ``ScaledObject`` and an ``hpa`` will be created in the airflow namespace.
 
 KEDA will derive the desired number of celery workers by querying
 Airflow metadata database:
 
-.. code-block:: none
+.. code-block:: sql
 

Review comment:
       pushed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-751333893


   @mik-laj following your comment here https://github.com/apache/airflow/pull/13183#issuecomment-748549890 i added a mechanism for validating CRD objects
   
   i enabled schema validation for the keda scaledojbect, and also removed the option to disable schema validation since it's no longer necessary


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-792114688


   Great -- will do


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ianstanton commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
ianstanton commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r586936226



##########
File path: chart/README.md
##########
@@ -323,22 +323,30 @@ helm install --name my-release \
 
 KEDA stands for Kubernetes Event Driven Autoscaling. [KEDA](https://github.com/kedacore/keda) is a custom controller that allows users to create custom bindings
 to the Kubernetes [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/).
-We've built a scaler that allows users to create scalers based on postgreSQL queries and shared it with the community. This enables us to scale the number of airflow workers deployed on Kubernetes by this chart depending on the number of task that are `queued` or `running`.
+
+The autoscaler will adjust the number of active celery workers based on the number of tasks in `queued` or `running` state.
+
+KEDA may be used with the `CeleryExecutor` and `CeleryKubernetesExecutor` executors.
+
+### Install KEDA
+
+First install into the cluster under namespace `keda`.
 
 ```bash
 helm repo add kedacore https://kedacore.github.io/charts
 
 helm repo update
 
-helm install \
-    --set image.keda=docker.io/kedacore/keda:1.2.0 \
-    --set image.metricsAdapter=docker.io/kedacore/keda-metrics-adapter:1.2.0 \
-    --namespace keda --name keda kedacore/keda
+kubectl create namespace keda
+
+helm install keda kedacore/keda \
+    --namespace keda \
+    --version "v2.0.0"
 ```
 
-Once KEDA is installed (which should be pretty quick since there is only one pod). You can try out KEDA autoscaling
-on this chart by setting `workers.keda.enabled=true` your helm command or in the `values.yaml`.
-(Note: KEDA does not support StatefulSets so you need to set `worker.persistence.enabled` to `false`)
+### Enable KEDA for airflow
+
+Enable for the airflow instance by setting `workers.keda.enabled=true` your helm command or in the `values.yaml`.

Review comment:
       ```suggestion
   Enable for the airflow instance by setting `workers.keda.enabled=true` in your helm command or in the `values.yaml`.
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-792881023


   @mik-laj @ianstanton  after taking another look at this i thought that adding the namedtuple in this case was maybe a little overkill / confusing because the elements are never accessed it's just a way constructing a key in the dict for the URL lookup, given the two components -- api version and kind.
   
   can you please take a look and let me know what you think of this perhaps simpler (though less explicit) alternative.
   
   and maybe you have a better suggestion.
   ```python
   crd_lookup = {
       'keda.sh/v1alpha1::ScaledObject': 'https://raw.githubusercontent.com/kedacore/keda/v2.0.0/config/crd/bases/keda.sh_scaledobjects.yaml',
   }
   ...
   def get_schema_crd(api_version, kind):
       url = crd_lookup.get(f"{api_version}::{kind}")
       if not url:
           return None
       response = requests.get(url)
       yaml_schema = response.content.decode('utf-8')
       schema = yaml.safe_load(StringIO(yaml_schema))
       return schema
   ```
   
   so i'm just joining the two components with `::` to make a string that's pretty straightforward to understand so that we don't need the extra K8sObject namedtuple. Also leaving the key as a regular tuple does work but end up much uglier with line breaks.
   
   i'm gonna just go ahead and add this in a second commit and request review and go from there


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589585392



##########
File path: chart/tests/helm_template_generator.py
##########
@@ -31,9 +32,12 @@
 
 BASE_URL_SPEC = "https://raw.githubusercontent.com/instrumenta/kubernetes-json-schema/master/v1.14.0"
 

Review comment:
       the concatenation of api version and kind in this way, `keda.sh/v1alpha1::ScaledObject` is a made-up syntax that only serves to provide a way to look up the url for a particular combination of  api version + kind




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-795929835


   @mik-laj @kaxil  One question about this.... This helm chart isn't "released" ... so I guess there's no need to put anything in `updating.md` for this update at this time.... but let me know if that's not true and you want me to add something somewhere....


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r588960938



##########
File path: chart/tests/helm_template_generator.py
##########
@@ -61,7 +88,7 @@ def validate_k8s_object(instance):
     validate.validate(instance)
 
 
-def render_chart(name="RELEASE-NAME", values=None, show_only=None, validate_schema=True):

Review comment:
       `validate_schema` was added recently to allow tests for CRDs but with this PR we have a way of pulling in the definition.
   it does require a network call here fwiw: https://github.com/apache/airflow/pull/13209/files#diff-91361141bde70cbfc90142ebe2377ce918025d3343d9216e00e50c66f083b12cR65

##########
File path: chart/templates/workers/worker-kedaautoscaler.yaml
##########
@@ -35,14 +35,18 @@ metadata:
 {{- end }}
 spec:
   scaleTargetRef:
-    deploymentName: {{ .Release.Name }}-worker
+    kind: {{ ternary "StatefulSet" "Deployment" .Values.workers.persistence.enabled }}
+    name: {{ .Release.Name }}-worker
   pollingInterval:  {{ .Values.workers.keda.pollingInterval }}   # Optional. Default: 30 seconds
   cooldownPeriod: {{ .Values.workers.keda.cooldownPeriod }}    # Optional. Default: 300 seconds
   maxReplicaCount: {{ .Values.workers.keda.maxReplicaCount }}   # Optional. Default: 100
   triggers:
     - type: postgresql
       metadata:
         targetQueryValue: "1"
-        connection: AIRFLOW_CONN_AIRFLOW_DB
-        query: "SELECT ceil(COUNT(*)::decimal / {{ .Values.config.celery.worker_concurrency }}) FROM task_instance WHERE state='running' OR state='queued'"

Review comment:
       while updating `connectionFromEnv` just decided to make this a bit more readable by splitting onto separate lines

##########
File path: docs/helm-chart/keda.rst
##########
@@ -24,27 +24,23 @@ KEDA stands for Kubernetes Event Driven Autoscaling.
 `KEDA <https://github.com/kedacore/keda>`__ is a custom controller that
 allows users to create custom bindings to the Kubernetes `Horizontal Pod
 Autoscaler <https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/>`__.
-We have built scalers that allows users to create scalers based on

Review comment:
       i removed this sentence because if i recall correctly the "we" is astronomer, so in this repo it doesn't really have the same meaning and i don't think the statement belongs.  but it's also sortof beside the point.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-773858692


   [The Workflow run](https://github.com/apache/airflow/actions/runs/539637438) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-796286575


   > @mik-laj @kaxil One question about this.... This helm chart isn't "released" ... so I guess there's no need to put anything in `updating.md` for this update at this time.... but let me know if that's not true and you want me to add something somewhere....
   
   Yup that is true, currently we don't need to put anything in Updating.md


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589551893



##########
File path: docs/helm-chart/keda.rst
##########
@@ -54,19 +50,22 @@ to set ``worker.persistence.enabled`` to ``false``)
        --namespace airflow \
        --set executor=CeleryExecutor \
        --set workers.keda.enabled=true \
-       --set workers.persistence.enabled=false
+
+A ``ScaledObject`` and an ``hpa`` will be created in the airflow namespace.
 
 KEDA will derive the desired number of celery workers by querying
 Airflow metadata database:
 
-.. code-block:: none
+.. code-block:: sql
 

Review comment:
       darn.  yes will fix that to.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-792881023


   @mik-laj @ianstanton  after taking another look at this i thought that adding the namedtuple in this case was maybe a little overkill / confusing because the elements are never accessed it's just a way constructing a key in the dict for the URL lookup, given the two components -- api version and kind.
   
   can you please take a look and let me know what you think of this perhaps simpler (though less explicit) alternative.
   
   and maybe you have a better suggestion.
   ```python
   crd_lookup = {
       'keda.sh/v1alpha1::ScaledObject': 'https://raw.githubusercontent.com/kedacore/keda/v2.0.0/config/crd/bases/keda.sh_scaledobjects.yaml',
   }
   ...
   def get_schema_crd(api_version, kind):
       url = crd_lookup.get(f"{api_version}::{kind}")
       if not url:
           return None
       response = requests.get(url)
       yaml_schema = response.content.decode('utf-8')
       schema = yaml.safe_load(StringIO(yaml_schema))
       return schema
   ```
   
   so i'm just joining the two components with `::` to make a string that's pretty straightforward to understand so that we don't need the extra K8sObject namedtuple. Also leaving the key as a regular tuple does work but end up much uglier with line breaks.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-792866398


   > Generally we can merge our selves after a review. Use your judgment if you need a re-review
   
   OK in that case since the rebase was just about docs and nothing materially changed i'll go ahead and merge.
   
   Thanks all


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish merged pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish merged pull request #13209:
URL: https://github.com/apache/airflow/pull/13209


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-792858528


   > Also what is the convention for committer-authors? Should the author merge it themselves after there is approval or is it expected that you wait for the reviewer to merge?
   
   Generally we can merge our selves after a review. Use your judgment if you need a re-review.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-773858692


   [The Workflow run](https://github.com/apache/airflow/actions/runs/539637438) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589585640



##########
File path: chart/tests/helm_template_generator.py
##########
@@ -31,9 +32,12 @@
 
 BASE_URL_SPEC = "https://raw.githubusercontent.com/instrumenta/kubernetes-json-schema/master/v1.14.0"
 
+crd_lookup = {
+    'keda.sh/v1alpha1::ScaledObject': 'https://raw.githubusercontent.com/kedacore/keda/v2.0.0/config/crd/bases/keda.sh_scaledobjects.yaml',  # noqa: E501 # pylint: disable=line-too-long
+}

Review comment:
       the above concatenation of api version and kind in this way, keda.sh/v1alpha1::ScaledObject is a made-up syntax that only serves to provide a simple way to look up the url for a particular combination of api version + kind for CRDs
   
   it is only referenced in one place:
   https://github.com/apache/airflow/blob/3805e1ee4d6601956b92130ec7ae6bad9f2243a9/chart/tests/helm_template_generator.py#L57




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-792857671


   @mik-laj do you want to have another look post-rebase?
   
   Also what is the convention for committer-authors?  Should the author merge it themselves after there is approval or is it expected that you wait for the reviewer to merge?  
   
   Thank you 🙏


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589551893



##########
File path: docs/helm-chart/keda.rst
##########
@@ -54,19 +50,22 @@ to set ``worker.persistence.enabled`` to ``false``)
        --namespace airflow \
        --set executor=CeleryExecutor \
        --set workers.keda.enabled=true \
-       --set workers.persistence.enabled=false
+
+A ``ScaledObject`` and an ``hpa`` will be created in the airflow namespace.
 
 KEDA will derive the desired number of celery workers by querying
 Airflow metadata database:
 
-.. code-block:: none
+.. code-block:: sql
 

Review comment:
       darn.  yes will fix that too.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589061507



##########
File path: chart/README.md
##########
@@ -323,22 +323,30 @@ helm install --name my-release \
 
 KEDA stands for Kubernetes Event Driven Autoscaling. [KEDA](https://github.com/kedacore/keda) is a custom controller that allows users to create custom bindings
 to the Kubernetes [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/).
-We've built a scaler that allows users to create scalers based on postgreSQL queries and shared it with the community. This enables us to scale the number of airflow workers deployed on Kubernetes by this chart depending on the number of task that are `queued` or `running`.
+
+The autoscaler will adjust the number of active celery workers based on the number of tasks in `queued` or `running` state.
+
+KEDA may be used with the `CeleryExecutor` and `CeleryKubernetesExecutor` executors.
+
+### Install KEDA
+
+First install into the cluster under namespace `keda`.
 
 ```bash
 helm repo add kedacore https://kedacore.github.io/charts
 
 helm repo update
 
-helm install \
-    --set image.keda=docker.io/kedacore/keda:1.2.0 \
-    --set image.metricsAdapter=docker.io/kedacore/keda-metrics-adapter:1.2.0 \
-    --namespace keda --name keda kedacore/keda
+kubectl create namespace keda
+
+helm install keda kedacore/keda \
+    --namespace keda \
+    --version "v2.0.0"
 ```
 
-Once KEDA is installed (which should be pretty quick since there is only one pod). You can try out KEDA autoscaling
-on this chart by setting `workers.keda.enabled=true` your helm command or in the `values.yaml`.
-(Note: KEDA does not support StatefulSets so you need to set `worker.persistence.enabled` to `false`)
+### Enable KEDA for airflow
+
+Enable for the airflow instance by setting `workers.keda.enabled=true` your helm command or in the `values.yaml`.

Review comment:
       Fixed thanks




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] turbaszek commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
turbaszek commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589007638



##########
File path: docs/helm-chart/keda.rst
##########
@@ -24,27 +24,23 @@ KEDA stands for Kubernetes Event Driven Autoscaling.
 `KEDA <https://github.com/kedacore/keda>`__ is a custom controller that
 allows users to create custom bindings to the Kubernetes `Horizontal Pod
 Autoscaler <https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/>`__.
-We have built scalers that allows users to create scalers based on

Review comment:
       Exactly it was me and Daniel, so not only Astronomer. Also Airflow is on KEDA users page.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589575026



##########
File path: docs/helm-chart/keda.rst
##########
@@ -54,19 +50,22 @@ to set ``worker.persistence.enabled`` to ``false``)
        --namespace airflow \
        --set executor=CeleryExecutor \
        --set workers.keda.enabled=true \
-       --set workers.persistence.enabled=false
+
+A ``ScaledObject`` and an ``hpa`` will be created in the airflow namespace.
 
 KEDA will derive the desired number of celery workers by querying
 Airflow metadata database:
 
-.. code-block:: none
+.. code-block:: sql
 

Review comment:
       We could set it to `:: jinja` I think, but that probably expects it to be template to output HTML, so would likely be worse than no highlighting.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589551080



##########
File path: docs/helm-chart/keda.rst
##########
@@ -54,19 +50,22 @@ to set ``worker.persistence.enabled`` to ``false``)
        --namespace airflow \
        --set executor=CeleryExecutor \
        --set workers.keda.enabled=true \
-       --set workers.persistence.enabled=false
+
+A ``ScaledObject`` and an ``hpa`` will be created in the airflow namespace.
 
 KEDA will derive the desired number of celery workers by querying
 Airflow metadata database:
 
-.. code-block:: none
+.. code-block:: sql
 

Review comment:
       this does not work because it is not valid SQL. It is mix of Jinja and SQL. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] turbaszek commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
turbaszek commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589007638



##########
File path: docs/helm-chart/keda.rst
##########
@@ -24,27 +24,23 @@ KEDA stands for Kubernetes Event Driven Autoscaling.
 `KEDA <https://github.com/kedacore/keda>`__ is a custom controller that
 allows users to create custom bindings to the Kubernetes `Horizontal Pod
 Autoscaler <https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/>`__.
-We have built scalers that allows users to create scalers based on

Review comment:
       Exactly it was me and Daniel, so not only Astronomer. Also Airflow is on KEDA users page. But I agree that it creates not value here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-792881023


   @mik-laj @ianstanton  after taking another look at this i thought that adding the namedtuple in this case was maybe a little overkill / confusing because the elements are never accessed it's just a way constructing a key in the dict for the URL lookup, given the two components -- api version and kind.
   
   can you please take a look and let me know what you think of this perhaps simpler (though less explicit) alternative.
   
   and maybe you have a better suggestion.
   ```python
   crd_lookup = {
       'keda.sh/v1alpha1::ScaledObject': 'https://raw.githubusercontent.com/kedacore/keda/v2.0.0/config/crd/bases/keda.sh_scaledobjects.yaml',
   }
   ...
   def get_schema_crd(api_version, kind):
       url = crd_lookup.get(f"{api_version}::{kind}")
       if not url:
           return None
       response = requests.get(url)
       yaml_schema = response.content.decode('utf-8')
       schema = yaml.safe_load(StringIO(yaml_schema))
       return schema
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#issuecomment-757353203


   [The Workflow run](https://github.com/apache/airflow/actions/runs/474373695) is cancelling this PR. Building images for the PR has failed. Follow the the workflow link to check the reason.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589584695



##########
File path: docs/helm-chart/keda.rst
##########
@@ -54,19 +50,22 @@ to set ``worker.persistence.enabled`` to ``false``)
        --namespace airflow \
        --set executor=CeleryExecutor \
        --set workers.keda.enabled=true \
-       --set workers.persistence.enabled=false
+
+A ``ScaledObject`` and an ``hpa`` will be created in the airflow namespace.
 
 KEDA will derive the desired number of celery workers by querying
 Airflow metadata database:
 
-.. code-block:: none
+.. code-block:: sql
 

Review comment:
       Yup, tested it (on terminal with `pygmentize -l jinja`) and it looks ugly.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #13209: Add support for worker persistence with KEDA v2.0.0 in helm chart

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13209:
URL: https://github.com/apache/airflow/pull/13209#discussion_r589551080



##########
File path: docs/helm-chart/keda.rst
##########
@@ -54,19 +50,22 @@ to set ``worker.persistence.enabled`` to ``false``)
        --namespace airflow \
        --set executor=CeleryExecutor \
        --set workers.keda.enabled=true \
-       --set workers.persistence.enabled=false
+
+A ``ScaledObject`` and an ``hpa`` will be created in the airflow namespace.
 
 KEDA will derive the desired number of celery workers by querying
 Airflow metadata database:
 
-.. code-block:: none
+.. code-block:: sql
 

Review comment:
       this does not work because it is not valid SQL. It is mix of Jinja and SQL, so we should have 2 lexes here, but it is supported by pygments. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org