You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/08/25 10:56:54 UTC

[GitHub] [airflow] Limess opened a new issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Limess opened a new issue #17828:
URL: https://github.com/apache/airflow/issues/17828


   <!--
   Welcome to Apache Airflow!
   
   Please complete the next sections or the issue will be closed.
   -->
   
   **Apache Airflow version**:
   
   <!-- AIRFLOW VERSION IS MANDATORY -->
   2.1.3
   
   **Deployment**:
   
   <!-- e.g. Virtualenv / VM / Docker-compose / K8S / Helm Chart / Managed Airflow Service -->
   
   <!-- Please include your deployment tools and versions: docker-compose, k8s, helm, etc -->
   
   ECS, docker-compose
   
   **What happened**:
   
   A redis connection URL of the form: 
   
   ```
   redis://<host>:6379/1
   ```
   
   is parsed incorrectly by this pattern: https://github.com/apache/airflow/blob/2.1.3/scripts/in_container/prod/entrypoint_prod.sh#L96 - the host is determined to be `localhost` as there is no `@` character.
   
   Adding an `@` character, e.g. `redis://@redis-analytics-airflow.signal:6379/1` results in correct parsing, but cannot connect to redis as auth fails.
   
   Logs:
   
   ```
   BACKEND=redis
   DB_HOST=localhost
   DB_PORT=6379
   ```
   
   and eventually
   
   ```
   ERROR! Maximum number of retries (20) reached.
   Last check result:
   $ run_nc 'localhost' '6379'
   (UNKNOWN) [127.0.0.1] 6379 (?) : Connection refused
   sent 0, rcvd 0
   ```
   
   **How to reproduce it**:
   
   Run airflow with an official docker image and `  AIRFLOW__CELERY__BROKER_URL: redis://<ANY_REDIS_HOSTNAME>:6379/1` and observe that it parses the host as `localhost`.
   
   **Anything else we need to know**:
   
   This was working without issue in Airflow 2.1.2 - weirdly I cannot see changes to the entrypoint, but we have not changed the URL `AIRFLOW__CELERY__BROKER_URL: redis://redis-analytics-airflow.signal:6379/1`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-905724680


   Hm I somehow have a feeling (do no know yet) that it could be side-effect of this change: which instead of reading the `__CELERY__` env variable, uses `airflow get config`:  https://github.com/apache/airflow/pull/17069)
   
   Do you have maybe some other way of defining the Broker ? Secret Backend ?
   Could you please tell what `airflow config get-value celery broker_url` returns?
   
   In the meantime you can workaround it by setting ``CONNECTION_CHECK_MAX_COUNT`` env variable to 0 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Limess edited a comment on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
Limess edited a comment on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-906240516


   1. We set via environment variable: `AIRFLOW__CELERY__BROKER_URL: redis://redis-analytics-airflow.signal:6379/1`
   2. I get this answer: `'redis://redis-analytics-airflow.signal:6379/1'`
   
   From what I can tell the regex is wrong for host if there's no username or password, e.g
   
   https://regex101.com/r/kEohzC/1
   
   The host is in the second capture group in this instance, but the entrypoint is expecting it in the fourth. The port is being inferred entirely from the `redis://` scheme


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Limess edited a comment on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
Limess edited a comment on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-906240516


   1. We set via environment variable: `AIRFLOW__CELERY__BROKER_URL: redis://redis-analytics-airflow.signal:6379/1`
   2. I get this answer: `'redis://redis-analytics-airflow.signal:6379/1'`
   
   From what I can tell the regex is wrong for host if there's no username or password, e.g
   
   https://regex101.com/r/kEohzC/1
   
   The host is in the second capture group in this instance, but the entrypoint is expecting it in the fourth. The port is being inferred entirely from the `redis://` scheme.
   
   I'd suggest this change:
   
   ```
   ([^:]*):\/\/(?:([^:@]*):?([^@]*)@)?([^\/:]*):?([0-9]*)\/([^\?]*)\??(.*)
   ```
   
   i.e. putting the user/pass part in it's own optional non-capture group - however that's not supported by bash regex as far as I'm aware


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-906326525


   Yeah....
   
   <details>
     <summary>The test script: Click to expand!</summary>
   
   ```
   #!/bin/bash
   
   connection_url="$1"
   
   [[ ${connection_url} =~ ([^:]*)://([^:@]*):?([^@]*)@?([^/:]*):?([0-9]*)/([^\?]*)\??(.*) ]] && \
           detected_backend=${BASH_REMATCH[1]} &&
           detected_user=${BASH_REMATCH[2]} &&
           detected_password=${BASH_REMATCH[3]} &&
           detected_host=${BASH_REMATCH[4]} &&
           detected_port=${BASH_REMATCH[5]} &&
           detected_schem=${BASH_REMATCH[6]} &&
           detected_params=${BASH_REMATCH[7]} &&
        echo "Backend='${detected_backend}'" &&
        echo "User='${detected_user}'" &&
        echo "Password='${detected_password}'" &&
        echo "Host='${detected_host}'" &&
        echo "Pprt='${detected_port}'" &&
        echo "Schema='${detected_schema}'" &&
        echo "Params='${detected_params}'"
   ```
   
   </details>
   
   And the result:
   
   ```
   ./test.bash "redis://redis-analytics-airflow.signal:6379/1"
   Backend='redis'
   User='redis-analytics-airflow.signal'
   Password='6379'
   Host=''
   Pprt=''
   Schema=''
   Params=''
   ```
   
   This is the embodiment of the famous ancient Chinese proverb (wisdom of the ancients):
   
   > If you have a problem, introduce regexp - you will have two problems.
   
   I will fix it without using regexps.
   
   For now I think you will have to leave with the workaround.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-905724680


   Hm I somehow have a feeling (do no know yet) that it could be side-effect of this change: which instead of reading the `__CELERY__` env variable, uses `airflow get-value`:  https://github.com/apache/airflow/pull/17069)
   
   Do you have maybe some other way of defining the Broker ? Secret Backend ?
   Could you please tell what `airflow config get-value celery broker_url` returns?
   
   In the meantime you can workaround it by setting ``CONNECTION_CHECK_MAX_COUNT`` env variable to 0 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Limess commented on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
Limess commented on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-906340195


   Yes the workaround is fine for us - we're using AWS elasticache so don't really need the wait anyway - it just showed up when upgrading.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-905758371


   BTW. The regexp pattern is correct .. It should nicely match the `redis://redis-analytics-airflow.signal:6379/1`. I am just thinking maybe it's about End-of-line returned by `airflow config get-value celery broker_url`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-905758371


   BTW. The regexp pattern is correct .. It should nicely match the `redis://redis-analytics-airflow.signal:6379/1`. I am just thinking maybe it's about End-of-line or " " returned by `airflow config get-value celery broker_url`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-906326525


   Yeah....
   
   <details>
     <summary>The test script: Click to expand!</summary>
   ```
   #!/bin/bash
   
   connection_url="$1"
   
   
   [[ ${connection_url} =~ ([^:]*)://([^:@]*):?([^@]*)@?([^/:]*):?([0-9]*)/([^\?]*)\??(.*) ]] && \
           detected_backend=${BASH_REMATCH[1]} &&
           detected_user=${BASH_REMATCH[2]} &&
           detected_password=${BASH_REMATCH[3]} &&
           detected_host=${BASH_REMATCH[4]} &&
           detected_port=${BASH_REMATCH[5]} &&
           detected_schem=${BASH_REMATCH[6]} &&
           detected_params=${BASH_REMATCH[7]} &&
        echo "Backend='${detected_backend}'" &&
        echo "User='${detected_user}'" &&
        echo "Password='${detected_password}'" &&
        echo "Host='${detected_host}'" &&
        echo "Pprt='${detected_port}'" &&
        echo "Schema='${detected_schema}'" &&
        echo "Params='${detected_params}'"
   ```
   </details>
   
   And the result:
   
   ```
   ./test.bash "redis://redis-analytics-airflow.signal:6379/1"
   Backend='redis'
   User='redis-analytics-airflow.signal'
   Password='6379'
   Host=''
   Pprt=''
   Schema=''
   Params=''
   ```
   
   This is the embodiment of the famous ancient Chinese proverb (wisdom of the ancients):
   
   > If you have a problem, introduce regexp - you will have two problems.
   
   I will fix it without using regexps.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-906326525


   Yeah....
   
   <details>
     <summary>The test script: Click to expand!</summary>
   
   ```
   #!/bin/bash
   
   connection_url="$1"
   
   [[ ${connection_url} =~ ([^:]*)://([^:@]*):?([^@]*)@?([^/:]*):?([0-9]*)/([^\?]*)\??(.*) ]] && \
           detected_backend=${BASH_REMATCH[1]} &&
           detected_user=${BASH_REMATCH[2]} &&
           detected_password=${BASH_REMATCH[3]} &&
           detected_host=${BASH_REMATCH[4]} &&
           detected_port=${BASH_REMATCH[5]} &&
           detected_schem=${BASH_REMATCH[6]} &&
           detected_params=${BASH_REMATCH[7]} &&
        echo "Backend='${detected_backend}'" &&
        echo "User='${detected_user}'" &&
        echo "Password='${detected_password}'" &&
        echo "Host='${detected_host}'" &&
        echo "Pprt='${detected_port}'" &&
        echo "Schema='${detected_schema}'" &&
        echo "Params='${detected_params}'"
   ```
   
   </details>
   
   And the result:
   
   ```
   ./test.bash "redis://redis-analytics-airflow.signal:6379/1"
   Backend='redis'
   User='redis-analytics-airflow.signal'
   Password='6379'
   Host=''
   Pprt=''
   Schema=''
   Params=''
   ```
   
   This is the embodiment of the famous ancient Chinese proverb (wisdom of the ancients):
   
   > If you have a problem, introduce regexp - you will have two problems.
   
   I will fix it without using regexps.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk closed issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #17828:
URL: https://github.com/apache/airflow/issues/17828


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-905724680


   Hm I somehow have a feeling (do no know yet) that it could be side-effect of this change: which instead of running the `__CELERY__` env variable, uses `airflow get config`:  https://github.com/apache/airflow/pull/17069)
   
   Do you have maybe some other way of defining the Broker ? Secret Backend ?
   Could you please tell what `airflow config get-value celery broker_url` returns?
   
   In the meantime you can workaround it by setting ``CONNECTION_CHECK_MAX_COUNT`` env variable to 0 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Limess edited a comment on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
Limess edited a comment on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-906240516


   1. We set via environment variable: `AIRFLOW__CELERY__BROKER_URL: redis://redis-analytics-airflow.signal:6379/1`
   2. I get this answer: `'redis://redis-analytics-airflow.signal:6379/1'`
   
   From what I can tell the regex is wrong for host if there's no username or password, e.g
   
   https://regex101.com/r/kEohzC/1
   
   The host is in the second capture group in this instance, but the entrypoint is expecting it in the fourth. The port is being inferred entirely from the `redis://` scheme.
   
   I'd suggest this change:
   
   ```
   ([^:]*):\/\/(?:([^:@]*):?([^@]*)@)?([^\/:]*):?([0-9]*)\/([^\?]*)\??(.*)
   ```
   
   i.e. putting the user/pass part in it's own optional non-capture group - however that's not supported by bash regex as far as I'm aware. I believe it would work to use a normal group and add 1 to all the captured group indices.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-906326525


   Yeah....
   
   <details>
     <summary>The test script: Click to expand!</summary>
   
   ```
   #!/bin/bash
   
   connection_url="$1"
   
   [[ ${connection_url} =~ ([^:]*)://([^:@]*):?([^@]*)@?([^/:]*):?([0-9]*)/([^\?]*)\??(.*) ]] && \
           detected_backend=${BASH_REMATCH[1]} &&
           detected_user=${BASH_REMATCH[2]} &&
           detected_password=${BASH_REMATCH[3]} &&
           detected_host=${BASH_REMATCH[4]} &&
           detected_port=${BASH_REMATCH[5]} &&
           detected_schem=${BASH_REMATCH[6]} &&
           detected_params=${BASH_REMATCH[7]} &&
        echo "Backend='${detected_backend}'" &&
        echo "User='${detected_user}'" &&
        echo "Password='${detected_password}'" &&
        echo "Host='${detected_host}'" &&
        echo "Pprt='${detected_port}'" &&
        echo "Schema='${detected_schema}'" &&
        echo "Params='${detected_params}'"
   ```
   
   </details>
   
   And the result:
   
   ```
   ./test.bash "redis://redis-analytics-airflow.signal:6379/1"
   Backend='redis'
   User='redis-analytics-airflow.signal'
   Password='6379'
   Host=''
   Pprt=''
   Schema=''
   Params=''
   ```
   
   This is the embodiment of the famous ancient Chinese proverb (wisdom of the ancients):
   
   > If you have a problem, introduce regexp - you will have two problems.
   
   I will fix it without using regexps.
   
   For now I think you will have to live with the workaround.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-906345644


   I just wonder WHY it worked before. ??? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-905415030


   Hi, would you be interested in investigating this and coming up with a fix for PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-905761599


   1. How do you set your broker variable
   2. Can you show us result of `echo "'$(airflow config get-value celery broker_url)'"`  - i added ' to see the boundaries of what it prints.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Limess commented on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
Limess commented on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-906240516


   1. We set via environment variable: `AIRFLOW__CELERY__BROKER_URL: redis://redis-analytics-airflow.signal:6379/1`
   2. I get this answer: `'redis://redis-analytics-airflow.signal:6379/1'`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on issue #17828: Docker image entrypoint does not parse redis URLs correctly with no password

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #17828:
URL: https://github.com/apache/airflow/issues/17828#issuecomment-905761599


   1. How do you set your broker variable ?
   2. Can you show us result of `echo "'$(airflow config get-value celery broker_url)'"`  - i added ' to see the boundaries of what it prints.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org