Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/12/17 18:49:50 UTC

[GitHub] [airflow] jonathanjuursema opened a new issue, #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

jonathanjuursema opened a new issue, #28010:
URL: https://github.com/apache/airflow/issues/28010

   ### Apache Airflow version
   
   2.4.3
   
   ### What happened
   
   When configuring Airflow/Celery to use Redis Sentinel as a broker, the following error pops up:
   
   ```
   airflow.exceptions.AirflowException: The broker you configured does not support SSL_ACTIVE to be True. Please use RabbitMQ or Redis if you would like to use SSL for broker.
   ```
   
   ### What you think should happen instead
   
   Celery has supported TLS on Redis Sentinel [for a while](https://docs.celeryq.dev/en/latest/history/whatsnew-5.1.html#support-redis-sentinel-with-ssl) now.
   
   It looks like [this piece of code](https://github.com/apache/airflow/blob/main/airflow/config_templates/default_celery.py#L68-L88) explicitly prevents a valid Redis Sentinel TLS configuration from being passed through to Celery, since Sentinel broker URLs are prefixed with `sentinel://` instead of `redis://`.
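
   For context, a minimal sketch of the kind of scheme check involved (the function name and exact prefixes are illustrative assumptions, not the actual Airflow code):

```python
# Hypothetical sketch of the broker-URL check in default_celery.py;
# names are illustrative, not the actual Airflow code.
def broker_supports_ssl(broker_url: str) -> bool:
    # Only amqp:// and redis:// prefixes are accepted, so a
    # "sentinel://" URL hits the "unsupported" branch even though
    # Celery itself supports TLS with Redis Sentinel.
    return broker_url.startswith(("amqp://", "redis://"))

# A Sentinel URL fails the check and triggers the AirflowException above:
broker_supports_ssl("sentinel://sentinel1:26379")  # → False
```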
   
   ### How to reproduce
   
   This problem can be reproduced by deploying Airflow using Docker with the following environment variables:
   
   ```
   AIRFLOW__CELERY__BROKER_URL=sentinel://sentinel1:26379;sentinel://sentinel2:26379;sentinel://sentinel3:26379
   AIRFLOW__CELERY__SSL_ACTIVE=true
   AIRFLOW__CELERY_BROKER_TRANSPORT_OPTIONS__MASTER_NAME='some-master-name'
   AIRFLOW__CELERY_BROKER_TRANSPORT_OPTIONS__PASSWORD='some-password'
   AIRFLOW__LOGGING__LOGGING_LEVEL=DEBUG
   ```
   
   Note that I'm not 100% certain of the syntax for the password environment variable. I can't get to the point of testing it: without TLS, connections to our internal brokers are refused (they require TLS), and with TLS no connection is attempted because of the code linked above.
   
   I've verified with the reference `redis-cli` that the `master-name` we use returns a valid response and that the Sentinel set-up works as expected.
   
   ### Operating System
   
   Docker (apache/airflow:2.4.3-python3.10)
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Other Docker-based deployment
   
   ### Deployment details
   
   Deployed using Nomad.
   
   ### Anything else
   
   This is my first issue with this open source project. Please let me know if there's more relevant information I can provide to follow through on this issue.
   
   I will try to make some time available soon to see if a simple code change in the file mentioned above would work, but as this is my first issue here I would still have to set up a full development environment.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   If this is indeed a simple fix I'd be willing to look into making a PR. I would like some feedback on the problem first though if possible!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #28010:
URL: https://github.com/apache/airflow/issues/28010#issuecomment-1336548821

   This is just default configuration. You can override it with your own dictionary:
   
   https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#celery-config-options




[GitHub] [airflow] jonathanjuursema commented on issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by GitBox <gi...@apache.org>.
jonathanjuursema commented on issue #28010:
URL: https://github.com/apache/airflow/issues/28010#issuecomment-1346775177

   Of course! While trying to reproduce the issue, the situation has changed. (I'm not sure why. I didn't commit the last configuration because I couldn't get it to work, so in reproducing I've started the process again. I'll make sure to save the config this time so that we can iterate on it if needed.)
   
   The Docker containers for the worker, webserver and scheduler have the following environment variables set (all config is done via environment variables):
   ```
   (airflow)printenv | grep AIRFLOW
   AIRFLOW__CORE__HOSTNAME_CALLABLE=socket.gethostname
   AIRFLOW__CORE__LOAD_EXAMPLES=false
   AIRFLOW_INSTALLATION_METHOD=
   AIRFLOW_USER_HOME_DIR=/home/airflow
   AIRFLOW__SMTP__SMTP_PASSWORD=xxx
   AIRFLOW__SMTP__SMTP_HOST=xxx
   AIRFLOW_PIP_VERSION=22.3.1
   AIRFLOW__SMTP__SMTP_USER=xxx
   AIRFLOW__SMTP__SMTP_SSL=false
   AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL=3600
   AIRFLOW_HOME=/opt/airflow
   AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_CMD=/opt/airflow/airflow_construct_sql_conn_str.sh
   AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL=300
   AIRFLOW__SMTP__SMTP_PORT=587
   AIRFLOW__SMTP__SMTP_STARTTLS=true
   AIRFLOW_UID=50000
   AIRFLOW__API__AUTH_BACKENDS=airflow.api.auth.backend.basic_auth, airflow.api.auth.backend.session
   AIRFLOW__CORE__ENABLE_XCOM_PICKLING=true
   AIRFLOW__CELERY__CELERY_CONFIG_OPTIONS=retail_celery_config.CELERY_CONFIG
   AIRFLOW__CORE__EXECUTOR=CeleryExecutor
   AIRFLOW__CORE__FERNET_KEY=xxx
   AIRFLOW__SCHEDULER__CATCHUP_BY_DEFAULT=false
   AIRFLOW__CELERY__RESULT_BACKEND_CMD=/opt/airflow/airflow_construct_dbsql_conn_str.sh
   AIRFLOW__LOGGING__LOGGING_LEVEL=DEBUG
   AIRFLOW__WEBSERVER__WEB_SERVER_PORT=8080
   AIRFLOW_VERSION=2.4.3
   AIRFLOW__SMTP__SMTP_MAIL_FROM=xxx
   AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=true
   ```
   
   ```
   (airflow)printenv | grep CELERY
   CELERY_SSL_ACTIVE=true
   AIRFLOW__CELERY__CELERY_CONFIG_OPTIONS=retail_celery_config.CELERY_CONFIG
   AIRFLOW__CELERY__RESULT_BACKEND_CMD=/opt/airflow/airflow_construct_dbsql_conn_str.sh
   ```
   
   ```
   (airflow)printenv | grep REDIS
   REDIS_BROKER_MASTER_PASSWORD=xxx
   REDIS_BROKER_MASTER_NAME=xxx
   REDIS_BROKER_URL=sentinel://xxx:26379;sentinel://xxx:26379;sentinel://xxx:26379
   ```
   
   I've also mounted the following file in `/opt/airflow/config/retail_celery_config.py`:
   ```python
   from airflow.config_templates.default_celery import DEFAULT_CELERY_CONFIG
   import os
   
   CELERY_CONFIG = {
       **DEFAULT_CELERY_CONFIG,
       'broker_url': '{broker_url}?ssl_cert_reqs=none'.format(broker_url=os.getenv('REDIS_BROKER_URL')),
       'broker_transport_options': {
           'password': os.getenv('REDIS_BROKER_MASTER_PASSWORD'),
           'master_name': os.getenv('REDIS_BROKER_MASTER_NAME')
       }
   }
   ```
   
   What I now observe is interesting. I _think_ the scheduler is working. Neither the webserver nor the scheduler is throwing relevant errors, and the webserver doesn't show the "scheduler hasn't run in xxx minutes" banner. If there are additional checks I can do, please let me know.
   
   However, the worker still won't start:
   ```
   [2022-12-12 15:46:10,811: ERROR/MainProcess] consumer: Cannot connect to sentinel://xxx:26379//: No master found for 'xxx'.
   Will retry using next failover.
   
   [2022-12-12 15:46:10,828: ERROR/MainProcess] consumer: Cannot connect to sentinel://xxx:26379//: No master found for 'xxx'.
   Will retry using next failover.
   
   [2022-12-12 15:46:10,840: ERROR/MainProcess] consumer: Cannot connect to sentinel://xxx:26379//: No master found for 'xxx'.
   Trying again in 32.00 seconds... (16/100)
   ```
   
   This indicates that it _does_ fetch the right values (or at least the master name and sentinel list). Using the reference `redis-cli` I can validate that the Redis configuration does work:
   
   ```
   ➜  src ./redis-cli -p 26379 --tls --insecure
   127.0.0.1:26379> sentinel get-master-addr-by-name non-existing-master-name
   (nil)
   127.0.0.1:26379> sentinel get-master-addr-by-name xxx
   1) "xx.xx.xx.xx"
   2) "7003"
   ```
   
   It should be noted that both the Sentinels and the Redis masters use a non-public CA (to make things even worse). Either the scheduler/webserver accept the `ssl_cert_reqs=none` from above and the worker doesn't, or the worker actually attempts an SSL connection while the scheduler/webserver don't attempt one or don't log their attempts.
   
   It should also be noted that we use an internally managed Redis/Sentinel cluster (which I don't control, we're just a user). However, we have various other applications deployed using Redis in the same cluster and from the same application machines (effectively using the same firewall/network path), and the other applications do work as intended, so my first hunch is that the problem is not with the Redis/Sentinel cluster itself.
   
   For now I'm ignoring certificate validation, but if I can get this to work I'd like to mount the CA PEM and specify that in the broker URL.
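
   A sketch of what such a broker URL could look like, assuming Celery's documented Redis TLS query parameters (`ssl_cert_reqs`, `ssl_ca_certs`) are also honoured on sentinel URLs; the hostnames and the CA path are example values:

```python
from urllib.parse import urlencode

# Example: append TLS query parameters to each sentinel URL.
# ssl_cert_reqs / ssl_ca_certs are Celery's Redis TLS options; whether
# they are honoured per-sentinel is exactly what is being tested here.
ssl_params = urlencode({
    "ssl_cert_reqs": "required",
    "ssl_ca_certs": "/opt/airflow/certs/internal-ca.pem",  # example mount path
})
sentinels = "sentinel://s1:26379;sentinel://s2:26379".split(";")
broker_url = ";".join(f"{url}?{ssl_params}" for url in sentinels)
```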
   
   Please let me know if you need any additional information, snippets or if you have further troubleshooting ideas. The help is greatly appreciated!




[GitHub] [airflow] jonathanjuursema commented on issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by GitBox <gi...@apache.org>.
jonathanjuursema commented on issue #28010:
URL: https://github.com/apache/airflow/issues/28010#issuecomment-1339573092

   Hey @potiuk, thanks for getting back!
   
   I have studied the docs you've linked and did some more Googling. Based on [this Stackoverflow question](https://stackoverflow.com/questions/44979811/adding-extra-celery-configs-to-airflow) I've built an implementation for our usecase, also implementing [this](https://stackoverflow.com/questions/44979811/adding-extra-celery-configs-to-airflow#comment94305165_48075177) syntax for keeping the defaults.
   
   However, when I deploy this, the deployment works and both Airflow and Celery seem happy with the config, but it still doesn't work.
   
   When inspecting the [actual runtime config in the web interface](https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#expose-config) I learn that it correctly reads the value for `AIRFLOW__CELERY__CELERY_CONFIG_OPTIONS`, but for `AIRFLOW__CELERY__BROKER_URL` it reverts to the default. Even though we're specifying our own (Redis Sentinel) broker URL in [this place](https://github.com/apache/airflow/blob/1.9.0/airflow/config_templates/default_celery.py#L30) in our own custom dict, the debug logging shows that Celery is trying to connect using the "default" connection string from `AIRFLOW__CELERY__BROKER_URL`. It seems to ignore the value from `AIRFLOW__CELERY__CELERY_CONFIG_OPTIONS`.




[GitHub] [airflow] potiuk closed issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.
URL: https://github.com/apache/airflow/issues/28010




[GitHub] [airflow] potiuk commented on issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #28010:
URL: https://github.com/apache/airflow/issues/28010#issuecomment-1346353631

   Can you post (anonymized) snippets of your configuration and links (ideally as gists) to Airflow logs with debugging enabled (you will find how to do it in the docs) showing what's going on?
   
   Maybe you've made a typo or misunderstood how to configure it, but showing the snippets and logs (or even looking at them closely yourself) should help in spotting it.
   
   I understand you say "we've done that", but before anyone attempts to reproduce it we need to see what you've done, so we can reproduce it and help you diagnose it.




[GitHub] [airflow] potiuk commented on issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #28010:
URL: https://github.com/apache/airflow/issues/28010#issuecomment-1356383936

   Ok. I think the problem is sentinel.
   
   Simply celery




[GitHub] [airflow] potiuk commented on issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #28010:
URL: https://github.com/apache/airflow/issues/28010#issuecomment-1356393735

   Few more things, just adding to the above - which might be a bad guess.
   
   The fact that neither the webserver nor the scheduler fails is because a) the webserver does not connect to Redis at all, and b) the scheduler only does so when scheduling tasks via the Celery executor, while the workers are trying to connect to it as consumers.
   
   I believe your configuration is passed properly, but some celery/networking/DNS/firewall configuration simply makes the connection attempts to the Redis instances fail.
   
   This error is quite clear about it:
   
   ```
   [2022-12-14 14:18:28,125: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@banaan:5672//: [Errno -2] Name or service not known.
   ```
   
   It indicates that the client cannot resolve the "banaan" name, which might simply mean that your broker address cannot be resolved by the Airflow worker.
   
   Now, I do not know what your `xxx` values in the configuration are, but to me you are facing a problem with networking/DNS, not with the client.
   
   So I guess this issue description was wrong. It's likely NOT about which SSL parameters to pass; most likely you have a deployment issue, @jonathanjuursema.
   




[GitHub] [airflow] potiuk closed issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk closed issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.
URL: https://github.com/apache/airflow/issues/28010




[GitHub] [airflow] potiuk commented on issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #28010:
URL: https://github.com/apache/airflow/issues/28010#issuecomment-1436044858

   I hope, then, that someone using Sentinel can further debug and solve it.




[GitHub] [airflow] ccwillia commented on issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by GitBox <gi...@apache.org>.
ccwillia commented on issue #28010:
URL: https://github.com/apache/airflow/issues/28010#issuecomment-1371614180

   Was this ever resolved? I am using a single sentinel URL in Airflow 1.10.10, which works. I have begun migrating to Airflow 2.5 and it does not appear to be working.




[GitHub] [airflow] boring-cyborg[bot] commented on issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #28010:
URL: https://github.com/apache/airflow/issues/28010#issuecomment-1332327588

   Thanks for opening your first issue here! Be sure to follow the issue template!
   




[GitHub] [airflow] potiuk commented on issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #28010:
URL: https://github.com/apache/airflow/issues/28010#issuecomment-1347262359

   Can you dump the env values as the same user the Airflow worker runs as (making sure the exact same entrypoint is used), and check whether your workers have the same variables and settings mounted as the scheduler/webserver?
   
   I guess the problem is that your workers do not have the same variables set, or the mounted file is not mounted there, or maybe the user Airflow runs as has no permissions. I think you can also track it down by enabling debug logging for Airflow (you can find it in the config/docs). Also, when I debug such issues I modify the config in a way that makes absolutely sure it is actually processed - for example, raising an exception with some meaningful message right after the configuration is parsed is a good way to see that it actually is, by the various components.
   
   Raising an exception in your config and seeing it in your logs will make sure you have not made any typo or mistake. This is what I'd do at least if I had a similar issue.
   
   Can you please make such an exercise?
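
   The suggested canary could look like this in `retail_celery_config.py` (a sketch; the broker URL is a placeholder and the error message is arbitrary):

```python
# Temporary canary for retail_celery_config.py: a module-level raise
# right after the config dict is built makes any component that actually
# imports this file fail loudly, proving the file is read.
CELERY_CONFIG = {
    "broker_url": "sentinel://example:26379",  # placeholder value
}
# In the real file you would raise unconditionally right here
# (and remove the line again after verifying):
# raise RuntimeError("retail_celery_config.CELERY_CONFIG was loaded")
```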




[GitHub] [airflow] jonathanjuursema commented on issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by GitBox <gi...@apache.org>.
jonathanjuursema commented on issue #28010:
URL: https://github.com/apache/airflow/issues/28010#issuecomment-1351541037

   I've spent some time playing with our set-up to tackle some of the question/challenges you set out. I have the following observations:
   
   **Is the configuration the same between the worker, webserver and scheduler?**
   Yes. As mentioned, we deploy Airflow in a containerized setting, and all containers (webserver, scheduler and worker) are provided environment variables from (mostly) the same central source. To double-check, I've run the following command in all three containers:
   
   ```bash
   printenv | grep AIRFLOW; printenv | grep REDIS; printenv | grep CELERY
   ```
   I sorted and compared the output in Excel (not by eye, but with a bunch of _if this cell equals that cell_ statements), and I am 100% sure all containers run the exact same environment config.
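
   The same comparison can be done without a spreadsheet; a sketch, assuming each dump was captured with something like `printenv | sort > worker.env` and read into a string:

```python
# Compare env dumps from two containers: the symmetric difference is
# the set of VAR=value lines present in one dump but not the other.
def env_diff(dump_a: str, dump_b: str) -> set[str]:
    return set(dump_a.splitlines()) ^ set(dump_b.splitlines())
```

An empty result means the two containers have identical environments.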
   
   **Can you make sure you are actually loading the intended configuration?**
   I did the following. I've updated the `/opt/airflow/config/retail_celery_config.py` I've discussed in my previous comment like this (note the broker URL):
   
   ```python
   from airflow.config_templates.default_celery import DEFAULT_CELERY_CONFIG
   import os
   
   CELERY_CONFIG = {
       **DEFAULT_CELERY_CONFIG,
       'broker_url': 'banana',
       'broker_transport_options': {
           'password': os.getenv('REDIS_BROKER_MASTER_PASSWORD'),
           'master_name': os.getenv('REDIS_BROKER_MASTER_NAME')
       }
   }
   ```
   
   If I deploy this way, I'm observing the following:
   
   The webserver and scheduler don't show anything weird in their logging. Their stdout looks fine, the scheduler's stderr is empty, and the webserver's stderr is below; I don't think it is related.
   ```
   /home/airflow/.local/lib/python3.10/site-packages/azure/storage/common/_connection.py:82 SyntaxWarning: "is" with a literal. Did you mean "=="?
   [2022-12-14 14:09:29 +0000] [30] [INFO] Starting gunicorn 20.1.0
   [2022-12-14 14:09:29 +0000] [30] [INFO] Listening at: http://0.0.0.0:8080 (30)
   [2022-12-14 14:09:29 +0000] [30] [INFO] Using worker: sync
   [2022-12-14 14:09:29 +0000] [46] [INFO] Booting worker with pid: 46
   [2022-12-14 14:09:29 +0000] [47] [INFO] Booting worker with pid: 47
   [2022-12-14 14:09:29 +0000] [48] [INFO] Booting worker with pid: 48
   [2022-12-14 14:09:29 +0000] [49] [INFO] Booting worker with pid: 49
   ```
   
   The worker, however, shows the following stdout:
   ```
    -------------- celery@f616d2ff89b0 v5.2.7 (dawn-chorus)
   --- ***** ----- 
   -- ******* ---- Linux-5.18.0-0.deb11.4-amd64-x86_64-with-glibc2.31 2022-12-14 14:08:30
   - *** --- * --- 
   - ** ---------- [config]
   - ** ---------- .> app:         airflow.executors.celery_executor:0x7f29e6748ac0
   - ** ---------- .> transport:   amqp://guest:**@banaan:5672//
   - ** ---------- .> results:     mysql://xxx:**@xxx:3306/xxx
   - *** --- * --- .> concurrency: 16 (prefork)
   -- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
   --- ***** ----- 
    -------------- [queues]
                   .> default          exchange=default(direct) key=default
   ```
   
   And the following in stderr:
   ```
   [2022-12-14 14:17:24,067: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@banaan:5672//: [Errno -2] Name or service not known.
   Trying again in 32.00 seconds... (16/100)
   
   [2022-12-14 14:17:56,098: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@banaan:5672//: [Errno -2] Name or service not known.
   Trying again in 32.00 seconds... (16/100)
   
   [2022-12-14 14:18:28,125: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@banaan:5672//: [Errno -2] Name or service not known.
   Trying again in 32.00 seconds... (16/100)
   
   [2022-12-14 14:19:00,160: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@banaan:5672//: [Errno -2] Name or service not known.
   Trying again in 32.00 seconds... (16/100)
   
   [2022-12-14 14:19:32,189: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@banaan:5672//: [Errno -2] Name or service not known.
   Trying again in 32.00 seconds... (16/100)
   
   [2022-12-14 14:20:04,217: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@banaan:5672//: [Errno -2] Name or service not known.
   Trying again in 32.00 seconds... (16/100)
   ```
   
   This suggests to me that _at least the worker_ is picking up the custom config.
   
   **Other observations.**
   
   This makes me wonder: if I set the Redis config to something bogus, how come the webserver and scheduler don't complain?
   
   In order to investigate this I set `AIRFLOW__WEBSERVER__EXPOSE_CONFIG=true` (`AIRFLOW__LOGGING__LOGGING_LEVEL=DEBUG` was already on before this experiment).
   
   Now I can observe the configuration in the Airflow web interface. This page has two sections. `/opt/airflow/airflow.cfg` shows the Airflow config file. This is just the default file; we don't specify it, so we're using the one that comes with the upstream Airflow container.
   
   Under `Running Configuration` we can see the actual running configuration, and here I see something interesting:
   | Section | Key | Value | Source |
   | --- | --- | --- | --- |
   | celery | broker_url | redis://redis:6379/0 | airflow.cfg |
   | celery | celery_config_options | retail_celery_config.CELERY_CONFIG | env var |
   
   It loads our custom celery config dict (as discussed earlier) from the env var. However, it also loads the `broker_url` from the `airflow.cfg` config file. Somehow, the worker appears to use the one from our custom config dict (since the logging clearly shows the test string there). The webserver and scheduler, I think, fall back to the default broker URL from `airflow.cfg` (or at least seem to ignore our custom dict). They don't show any connection errors, however (I've shared the logs above; the stdout logs don't reference the test string anywhere, nor is there any indication something is wrong). According to [the docs](https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html), I'd expect the environment variable to take priority. I'm not sure why (if `redis://redis:6379/0` does not exist) the webserver and scheduler seem to work fine.
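
   One possible explanation (a guess, sketched below with stand-in values): the worker builds its Celery app from the `celery_config_options` dict, where a later `broker_url` key shadows the default merged in from `DEFAULT_CELERY_CONFIG`, while the config page independently reports the `[celery] broker_url` setting:

```python
# Dict-merge precedence: the later key wins, so 'broker_url' in the
# custom dict shadows the value merged in from DEFAULT_CELERY_CONFIG.
DEFAULT_CELERY_CONFIG = {"broker_url": "redis://redis:6379/0"}  # stand-in for the airflow.cfg value

CELERY_CONFIG = {
    **DEFAULT_CELERY_CONFIG,
    "broker_url": "banana",  # the test value from the experiment above
}
# CELERY_CONFIG["broker_url"] is now "banana" - what the worker used -
# while the config page still reports the [celery] broker_url setting.
```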
   
   I've also searched in our log aggregator (the container UI is not the best one for investigating logs older than a few minutes) for the test string, and for the string `redis`. The first only shows log lines from the worker container (the ones I shared above), the second one shows the following:
   
   ```
   Date,Host,Service,Container Name,Message
   "2022-12-14T13:49:07.140Z","""vmXXXX""","""airflow""","""airflow-init-5df407da-2388-dc6f-15be-78cec8708021""","[2022-12-14 13:49:07,140] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   "2022-12-14T13:49:11.403Z","""vmXXXX""","""airflow""","""airflow-init-5df407da-2388-dc6f-15be-78cec8708021""","[2022-12-14 13:49:11,402] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   "2022-12-14T13:49:37.692Z","""vmXXXX""","""airflow""","""airflow-webserver-5df407da-2388-dc6f-15be-78cec8708021""","[2022-12-14 13:49:37,692] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   "2022-12-14T13:49:43.940Z","""vmXXXX""","""airflow""","""airflow-webserver-5df407da-2388-dc6f-15be-78cec8708021""","[2022-12-14 13:49:43,940] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   "2022-12-14T13:49:47.090Z","""vmXXXX""","""airflow""","""airflow-webserver-5df407da-2388-dc6f-15be-78cec8708021""","[2022-12-14 13:49:47,088] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   "2022-12-14T14:02:36.727Z","""vmXXXX""","""airflow""","""airflow-worker-8860332e-07cf-20dc-19b1-56c4ba462531""","File ""/home/airflow/.local/lib/python3.10/site-packages/redis/client.py"", line 1378, in ping"
   "2022-12-14T14:02:36.727Z","""vmXXXX""","""airflow""","""airflow-worker-8860332e-07cf-20dc-19b1-56c4ba462531""","File ""/home/airflow/.local/lib/python3.10/site-packages/redis/client.py"", line 898, in execute_command"
   "2022-12-14T14:02:36.727Z","""vmXXXX""","""airflow""","""airflow-worker-8860332e-07cf-20dc-19b1-56c4ba462531""","File ""/home/airflow/.local/lib/python3.10/site-packages/redis/connection.py"", line 1192, in get_connection"
   "2022-12-14T14:02:36.727Z","""vmXXXX""","""airflow""","""airflow-worker-8860332e-07cf-20dc-19b1-56c4ba462531""","File ""/home/airflow/.local/lib/python3.10/site-packages/redis/sentinel.py"", line 44, in connect"
   "2022-12-14T14:02:36.727Z","""vmXXXX""","""airflow""","""airflow-worker-8860332e-07cf-20dc-19b1-56c4ba462531""","File ""/home/airflow/.local/lib/python3.10/site-packages/redis/sentinel.py"", line 106, in get_master_address"
   "2022-12-14T14:02:36.727Z","""vmXXXX""","""airflow""","""airflow-worker-8860332e-07cf-20dc-19b1-56c4ba462531""","File ""/home/airflow/.local/lib/python3.10/site-packages/redis/sentinel.py"", line 219, in discover_master"
   "2022-12-14T14:04:13.228Z","""vmXXXX""","""airflow""","""airflow-init-6686130f-cb74-c46b-2bff-a81c723030ea""","[2022-12-14 14:04:13,227] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   "2022-12-14T14:04:17.405Z","""vmXXXX""","""airflow""","""airflow-init-6686130f-cb74-c46b-2bff-a81c723030ea""","[2022-12-14 14:04:17,405] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   "2022-12-14T14:04:45.321Z","""vmXXXX""","""airflow""","""airflow-webserver-6686130f-cb74-c46b-2bff-a81c723030ea""","[2022-12-14 14:04:45,320] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   "2022-12-14T14:04:52.018Z","""vmXXXX""","""airflow""","""airflow-webserver-6686130f-cb74-c46b-2bff-a81c723030ea""","[2022-12-14 14:04:52,018] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   "2022-12-14T14:04:55.988Z","""vmXXXX""","""airflow""","""airflow-webserver-6686130f-cb74-c46b-2bff-a81c723030ea""","[2022-12-14 14:04:55,987] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   "2022-12-14T14:08:44.904Z","""vmXXXX""","""airflow""","""airflow-init-b75ff09a-cfc0-dad0-0ece-8d3bdcda9553""","[2022-12-14 14:08:44,904] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   "2022-12-14T14:08:49.385Z","""vmXXXX""","""airflow""","""airflow-init-b75ff09a-cfc0-dad0-0ece-8d3bdcda9553""","[2022-12-14 14:08:49,384] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   "2022-12-14T14:09:18.278Z","""vmXXXX""","""airflow""","""airflow-webserver-b75ff09a-cfc0-dad0-0ece-8d3bdcda9553""","[2022-12-14 14:09:18,278] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   "2022-12-14T14:09:24.433Z","""vmXXXX""","""airflow""","""airflow-webserver-b75ff09a-cfc0-dad0-0ece-8d3bdcda9553""","[2022-12-14 14:09:24,433] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   "2022-12-14T14:09:29.312Z","""vmXXXX""","""airflow""","""airflow-webserver-b75ff09a-cfc0-dad0-0ece-8d3bdcda9553""","[2022-12-14 14:09:29,311] {providers_manager.py:433} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis"
   
   ```
   
   Looking forward to your observations! Do let me know if there's any more information I can provide. :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jonathanjuursema commented on issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by GitBox <gi...@apache.org>.
jonathanjuursema commented on issue #28010:
URL: https://github.com/apache/airflow/issues/28010#issuecomment-1359360159

   Hey! Thanks for checking back.
   
   I think we're confusing two things here.
   
   The test with the `banana` hostname was in response to the following challenge:
   > Raising an exception in your config and seeing it in your logs will make sure you have not made any typo or problem
   
   I've set the hostname to something non-existent to validate that the configuration is indeed loaded. Because the worker throws the "name or service not known" error, it does seem to correctly load the (invalid) config. The scheduler and web server, however, don't seem to, as they would otherwise also have thrown this or a similar exception.
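   
   A concrete sketch of that "raise an exception in your config" smoke test (the raise is wrapped in a function here only so the snippet runs on its own; in a real setup it would sit at module top level in the custom Celery config module referenced by `AIRFLOW__CELERY__CELERY_CONFIG_OPTIONS`):
   ```python
   # Sketch of the config-loading smoke test discussed above. Any Airflow
   # component that actually imports the custom config module would crash
   # with this message, proving the module is being read.

   def simulate_custom_config_import():
       # In the real config module, this raise would be at top level.
       raise RuntimeError("custom celery config was loaded")


   def component_loads_config() -> str:
       """Simulate a component importing the config and report the result."""
       try:
           simulate_custom_config_import()
       except RuntimeError as exc:
           return str(exc)
       return "config was never loaded"
   ```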
   
   Just to be sure, I've also validated name resolution with the actual domain names that I'm having the problems with (the internal information redacted with `xxx`). This works as expected:
   ```
   (airflow)host xxx
   xxx has address 10.120.xxx.xxx
   ```
   
   Finally, I'm positive we can rule out firewalling. Other applications running on the same virtual machines can access those Redis Sentinel instances just fine, and our Redis firewalls are configured to accept all incoming connections from the virtual machines that Airflow (and those other apps) run on.




[GitHub] [airflow] dintorf commented on issue #28010: Airflow does not pass through Celery's support for Redis Sentinel over SSL.

Posted by "dintorf (via GitHub)" <gi...@apache.org>.
dintorf commented on issue #28010:
URL: https://github.com/apache/airflow/issues/28010#issuecomment-1460761944

   I have some snippets in [Airflow Issue #28655](https://github.com/apache/airflow/issues/28655#issuecomment-1458723703). This worked for me locally, but I have yet to test it elsewhere. If this works for anyone else, I will submit the example to the documentation.
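   
   For readers who haven't followed that link yet, a rough sketch of what such a Celery override might look like. This is untested here; every hostname, master name, password, and certificate path below is a placeholder, and in a real deployment the dict would typically be merged with Airflow's `DEFAULT_CELERY_CONFIG` and referenced via `[celery] celery_config_options`:
   ```python
   import ssl

   # Hypothetical Celery override for a Redis Sentinel broker over TLS.
   # All values below are placeholders; adapt them to your environment.
   SENTINEL_SSL_CELERY_CONFIG = {
       "broker_url": (
           "sentinel://sentinel1:26379;"
           "sentinel://sentinel2:26379;"
           "sentinel://sentinel3:26379"
       ),
       # TLS settings for the broker connection itself.
       "broker_use_ssl": {
           "ssl_cert_reqs": ssl.CERT_REQUIRED,
           "ssl_ca_certs": "/path/to/ca.pem",  # placeholder path
       },
       # Sentinel needs to know which master to ask for, and the Sentinel
       # nodes may need their own credentials, separate from the master's.
       "broker_transport_options": {
           "master_name": "some-master-name",  # placeholder
           "sentinel_kwargs": {"password": "some-password"},  # placeholder
       },
   }
   ```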

