You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "ferruzzi (via GitHub)" <gi...@apache.org> on 2023/02/14 00:04:02 UTC

[GitHub] [airflow] ferruzzi opened a new pull request, #29521: Breeze OpenTelemetry Integration

ferruzzi opened a new pull request, #29521:
URL: https://github.com/apache/airflow/pull/29521

   This PR builds on the recently-merged [Breeze StatsD Integration](https://github.com/apache/airflow/pull/29449) PR.  
   
   [#29449 ](https://github.com/apache/airflow/pull/29449) established a Docker Compose file which launched everything needed to emit metrics over StatsD and view them in Grafana.  This PR lays the groundwork for the same process but uses OpenTelemetry instead of StatsD.  It will be a required test environment for upcoming work on [AIP-49 OpenTelemetry Support for Apache Airflow](https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow).
   
   I included a disclaimer in the documentation page that the underlying work to make this useful is pending.
   
   The first commit [b84e96c](https://github.com/apache/airflow/commit/b84e96ce628e9e4a4457f5d58e95142c3101ca52) makes some adjustments to the previous StatsD PR in order to make them more reusable.  Second commit [8092112](https://github.com/apache/airflow/commit/809211299c500fb2c8c1eeaee649e9cdf25cbdbc) implements the new work.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #29521: Breeze OpenTelemetry Integration

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on PR #29521:
URL: https://github.com/apache/airflow/pull/29521#issuecomment-1430295579

   Actually  - if my guess is right, then we have to fix it and NOT use httpbin in our tests - this means that they rely on external service we have no control over and SOMEONE my deliberately actively impair our CI.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #29521: Breeze OpenTelemetry Integration

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on PR #29521:
URL: https://github.com/apache/airflow/pull/29521#issuecomment-1430637315

   > ```
   > FAILED tests/providers/http/hooks/test_http.py::test_do_api_call_async_retryable_error - aiohttp.client_exceptions.ClientConnectionError: Connection refused: GET http://httpbin.org/non_existent_endpoint
   > ```
   > 
   > I didn't copy the error message from before, but it looks familiar, I think it's the same transient one @potiuk mentioned.
   
   I tihnk this is simply some kind of instability of the `http://httpbin.org/` - or maybe they (or their systems) **THINK** that we are DDOS-ing them with our CI and block the attempt if there are too many.
   
   With Airflow's scale of PRs and CI it's actually even likely :D 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on pull request #29521: Breeze OpenTelemetry Integration

Posted by "ferruzzi (via GitHub)" <gi...@apache.org>.
ferruzzi commented on PR #29521:
URL: https://github.com/apache/airflow/pull/29521#issuecomment-1430288818

   Main is broken again, should have not rebased yet I guess.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on pull request #29521: Breeze OpenTelemetry Integration

Posted by "ferruzzi (via GitHub)" <gi...@apache.org>.
ferruzzi commented on PR #29521:
URL: https://github.com/apache/airflow/pull/29521#issuecomment-1430662550

   It just passed, so unless someone fixed something then it's intermittent.   :+1: 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on a diff in pull request #29521: Breeze OpenTelemetry Integration

Posted by "ferruzzi (via GitHub)" <gi...@apache.org>.
ferruzzi commented on code in PR #29521:
URL: https://github.com/apache/airflow/pull/29521#discussion_r1106112793


##########
scripts/ci/docker-compose/otel-collector-config.yml:
##########
@@ -0,0 +1,56 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+extensions:
+  memory_ballast:
+    size_mib: 512
+  zpages:
+    endpoint: 0.0.0.0:55679
+
+receivers:
+  otlp:
+    protocols:
+      grpc:
+      http:
+
+processors:
+  batch:
+  memory_limiter:
+    # 75% of maximum memory up to 4G
+    limit_mib: 1536

Review Comment:
   You are absolutely right.   I copied that block from a guide and totally missed that.  I'll fix or remove the comment



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on a diff in pull request #29521: Breeze OpenTelemetry Integration

Posted by "ferruzzi (via GitHub)" <gi...@apache.org>.
ferruzzi commented on code in PR #29521:
URL: https://github.com/apache/airflow/pull/29521#discussion_r1106112793


##########
scripts/ci/docker-compose/otel-collector-config.yml:
##########
@@ -0,0 +1,56 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+extensions:
+  memory_ballast:
+    size_mib: 512
+  zpages:
+    endpoint: 0.0.0.0:55679
+
+receivers:
+  otlp:
+    protocols:
+      grpc:
+      http:
+
+processors:
+  batch:
+  memory_limiter:
+    # 75% of maximum memory up to 4G
+    limit_mib: 1536

Review Comment:
   You are absolutely right.   I copied that block from a guide and totally missed that.  I'll fix it here and in the statsd-intergration 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk merged pull request #29521: Breeze OpenTelemetry Integration

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk merged PR #29521:
URL: https://github.com/apache/airflow/pull/29521


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] o-nikolas commented on a diff in pull request #29521: Breeze OpenTelemetry Integration

Posted by "o-nikolas (via GitHub)" <gi...@apache.org>.
o-nikolas commented on code in PR #29521:
URL: https://github.com/apache/airflow/pull/29521#discussion_r1105311973


##########
scripts/ci/docker-compose/integration-otel.yml:
##########
@@ -0,0 +1,67 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+version: "3.7"
+services:
+  otel-collector:
+    image: otel/opentelemetry-collector:0.70.0
+    container_name: "breeze-opentelemetry-collector"
+    ports:
+      - "1888:1888"     # pprof extension
+      - "28888:8888"    # Prometheus metrics exposed by the collector
+      - "28889:8889"    # Prometheus exporter metrics
+      - "13133:13133"   # health_check extension
+      - "4317:4317"     # OTLP gRPC receiver
+      - "4318:4318"     # OTLP http receiver
+      - "55679:55679"   # zpages extension
+
+  prometheus:
+    image: prom/prometheus
+    container_name: "breeze-prometheus"
+    user: "0"
+    ports:
+      - "29090:9090"
+    volumes:
+      - ./prometheus-config.yml:/etc/prometheus/prometheus.yml
+      - ./prometheus/volume:/prometheus
+
+  grafana:
+    image: grafana/grafana:8.2.4
+    container_name: "breeze-grafana"
+    environment:
+      GF_AUTH_ANONYMOUS_ENABLED: true
+      GF_AUTH_ANONYMOUS_ORG_NAME: "Main Org."
+      GF_AUTH_ANONYMOUS_ORG_ROLE: "Admin"
+      GF_PATHS_PROVISIONING: /grafana/provisioning
+    ports:
+      - "23000:3000"
+    volumes:
+      - ./grafana/volume/data:/grafana
+      - ./grafana/volume/datasources:/grafana/datasources
+      - ./grafana/volume/dashboards:/grafana/dashboards
+      - ./grafana/volume/provisioning:/grafana/provisioning
+
+  airflow:
+    environment:
+      - INTEGRATION_STATSD=true
+      - AIRFLOW__METRICS__STATSD_ON=True
+      - AIRFLOW__METRICS__STATSD_HOST=statsd-exporter
+      - AIRFLOW__METRICS__STATSD_PORT=9125

Review Comment:
   Are these statsd values being set temporary? Since this is the OTEL integration I would expect to see OTEL env vars here not statsd



##########
scripts/ci/docker-compose/otel-collector-config.yml:
##########
@@ -0,0 +1,56 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+extensions:
+  memory_ballast:
+    size_mib: 512
+  zpages:
+    endpoint: 0.0.0.0:55679
+
+receivers:
+  otlp:
+    protocols:
+      grpc:
+      http:
+
+processors:
+  batch:
+  memory_limiter:
+    # 75% of maximum memory up to 4G
+    limit_mib: 1536

Review Comment:
   This limit doesn't match the comment?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on pull request #29521: Breeze OpenTelemetry Integration

Posted by "ferruzzi (via GitHub)" <gi...@apache.org>.
ferruzzi commented on PR #29521:
URL: https://github.com/apache/airflow/pull/29521#issuecomment-1430325621

   ```
   FAILED tests/providers/http/hooks/test_http.py::test_do_api_call_async_retryable_error - aiohttp.client_exceptions.ClientConnectionError: Connection refused: GET http://httpbin.org/non_existent_endpoint
   ```
   
   I didn't copy the error message from before, but it looks familiar, I think it's the same transient one @potiuk mentioned.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on pull request #29521: Breeze OpenTelemetry Integration

Posted by "ferruzzi (via GitHub)" <gi...@apache.org>.
ferruzzi commented on PR #29521:
URL: https://github.com/apache/airflow/pull/29521#issuecomment-1430292132

   Cool.  I peeked in the PRs list and there were a LOT of them failing so I assumed it was another main hiccup.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #29521: Breeze OpenTelemetry Integration

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on PR #29521:
URL: https://github.com/apache/airflow/pull/29521#issuecomment-1430290806

   looks transient error with httpbin. rerunning.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on a diff in pull request #29521: Breeze OpenTelemetry Integration

Posted by "ferruzzi (via GitHub)" <gi...@apache.org>.
ferruzzi commented on code in PR #29521:
URL: https://github.com/apache/airflow/pull/29521#discussion_r1106111684


##########
scripts/ci/docker-compose/integration-otel.yml:
##########
@@ -0,0 +1,67 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+version: "3.7"
+services:
+  otel-collector:
+    image: otel/opentelemetry-collector:0.70.0
+    container_name: "breeze-opentelemetry-collector"
+    ports:
+      - "1888:1888"     # pprof extension
+      - "28888:8888"    # Prometheus metrics exposed by the collector
+      - "28889:8889"    # Prometheus exporter metrics
+      - "13133:13133"   # health_check extension
+      - "4317:4317"     # OTLP gRPC receiver
+      - "4318:4318"     # OTLP http receiver
+      - "55679:55679"   # zpages extension
+
+  prometheus:
+    image: prom/prometheus
+    container_name: "breeze-prometheus"
+    user: "0"
+    ports:
+      - "29090:9090"
+    volumes:
+      - ./prometheus-config.yml:/etc/prometheus/prometheus.yml
+      - ./prometheus/volume:/prometheus
+
+  grafana:
+    image: grafana/grafana:8.2.4
+    container_name: "breeze-grafana"
+    environment:
+      GF_AUTH_ANONYMOUS_ENABLED: true
+      GF_AUTH_ANONYMOUS_ORG_NAME: "Main Org."
+      GF_AUTH_ANONYMOUS_ORG_ROLE: "Admin"
+      GF_PATHS_PROVISIONING: /grafana/provisioning
+    ports:
+      - "23000:3000"
+    volumes:
+      - ./grafana/volume/data:/grafana
+      - ./grafana/volume/datasources:/grafana/datasources
+      - ./grafana/volume/dashboards:/grafana/dashboards
+      - ./grafana/volume/provisioning:/grafana/provisioning
+
+  airflow:
+    environment:
+      - INTEGRATION_STATSD=true
+      - AIRFLOW__METRICS__STATSD_ON=True
+      - AIRFLOW__METRICS__STATSD_HOST=statsd-exporter
+      - AIRFLOW__METRICS__STATSD_PORT=9125

Review Comment:
   Correct, at the moment there is no code in Airflow that would accept other values so I left these for now, but in hindsight that doesn't make any sense I guess.  I'll remove them  and leave a note to see the statsd one for formatting once they are implemented



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on a diff in pull request #29521: Breeze OpenTelemetry Integration

Posted by "ferruzzi (via GitHub)" <gi...@apache.org>.
ferruzzi commented on code in PR #29521:
URL: https://github.com/apache/airflow/pull/29521#discussion_r1106112793


##########
scripts/ci/docker-compose/otel-collector-config.yml:
##########
@@ -0,0 +1,56 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+extensions:
+  memory_ballast:
+    size_mib: 512
+  zpages:
+    endpoint: 0.0.0.0:55679
+
+receivers:
+  otlp:
+    protocols:
+      grpc:
+      http:
+
+processors:
+  batch:
+  memory_limiter:
+    # 75% of maximum memory up to 4G
+    limit_mib: 1536

Review Comment:
   You are absolutely right.   I copied that block from a guide and totally missed that.  I'll fix or remove the comment here and in the statsd-intergration 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on pull request #29521: Breeze OpenTelemetry Integration

Posted by "ferruzzi (via GitHub)" <gi...@apache.org>.
ferruzzi commented on PR #29521:
URL: https://github.com/apache/airflow/pull/29521#issuecomment-1430147462

   Sorry for the force, I had to rebase


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org