Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/07/14 15:28:46 UTC

[GitHub] [airflow] r723235 opened a new issue, #25063: Airflow DB Clean Fails for `callback_request`

r723235 opened a new issue, #25063:
URL: https://github.com/apache/airflow/issues/25063

   ### Apache Airflow version
   
   2.3.3 (latest released)
   
   ### What happened
   
   Docker: `apache/airflow:2.3.3-python3.9`
   
   DB: AWS RDS Postgres 14
   ```
   airflow=> select version();
                              version
   --------------------------------------------------------------
    PostgreSQL 14.1 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), 64-bit
   (1 row)
   
   airflow=> SHOW server_version;
    server_version
   ----------------
    14.1
   (1 row)
   
   airflow=> \! psql -V
   psql (PostgreSQL) 14.4 (Debian 14.4-1.pgdg110+1)
   ```
   
   I am attempting to execute `airflow db clean`. However, I come across an error for one specific table:
   ```
   $ airflow db clean --verbose --clean-before-timestamp $(date +"%F") --dry-run -t callback_request
   Performing dry run for db cleanup.
   Data prior to 2022-07-14T00:00:00+00:00 would be purged from tables ['callback_request'] with the following config:
   
   table            | recency_column     | keep_last | keep_last_filters | keep_last_group_by | warn_if_missing
   =================+====================+===========+===================+====================+================
   callback_request | BaseXCom.timestamp | False     | None              | None               | False
   
   
   Performing dry run for table 'callback_request'
   /home/airflow/.local/lib/python3.9/site-packages/airflow/utils/db_cleanup.py:137 SAWarning: SELECT statement has a cartesian product between FROM element(s) "callback_request" and FROM element "xcom".  Apply join condition(s) between each element to resolve.
   Found 0 rows meeting deletion criteria.
   ```
   
   This error does not occur for any other table.
   
   ### What you think should happen instead
   
   Error: `SAWarning: SELECT statement has a cartesian product between FROM element(s) "callback_request" and FROM element "xcom".  Apply join condition(s) between each element to resolve.`
   
   I assume that this error shouldn't happen. If it is expected, the verbose output should explain why it occurs.
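
As a rough illustration of what the warning describes, the sketch below uses Python's built-in `sqlite3` module rather than Airflow or SQLAlchemy. It assumes the config shown above (table `callback_request`, recency column `BaseXCom.timestamp`) leads to a query that filters `callback_request` on a column of `xcom`, which pulls both tables into `FROM` with no join condition. The schemas, the `created_at` column name, and the helper functions are hypothetical and simplified; this is not Airflow's actual query.

```python
import sqlite3

CUTOFF = "2022-07-14"


def make_db(xcom_rows):
    """Build an in-memory DB with 2 callback_request rows and `xcom_rows` xcom rows.

    Both schemas are hypothetical and reduced to the columns needed here.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE callback_request (id INTEGER PRIMARY KEY, created_at TEXT);
        CREATE TABLE xcom (id INTEGER PRIMARY KEY, timestamp TEXT);
        INSERT INTO callback_request (created_at)
        VALUES ('2022-07-01'), ('2022-07-02');
        """
    )
    conn.executemany(
        "INSERT INTO xcom (timestamp) VALUES (?)",
        [("2022-07-01",)] * xcom_rows,
    )
    return conn


def count_mismatched(conn, cutoff):
    # Filtering on xcom.timestamp forces xcom into FROM without a join
    # condition, so every callback_request row pairs with every matching xcom row.
    sql = "SELECT count(*) FROM callback_request, xcom WHERE xcom.timestamp < ?"
    return conn.execute(sql, (cutoff,)).fetchone()[0]


def count_own_column(conn, cutoff):
    # Filtering on a column of callback_request itself involves a single table.
    sql = "SELECT count(*) FROM callback_request WHERE created_at < ?"
    return conn.execute(sql, (cutoff,)).fetchone()[0]


if __name__ == "__main__":
    for n in (0, 3):
        conn = make_db(n)
        print(n, count_mismatched(conn, CUTOFF), count_own_column(conn, CUTOFF))
```

Under these assumptions the mismatched query counts 0 rows when `xcom` is empty and 6 rows (2 x 3) when `xcom` holds three old rows, while the single-table query counts 2 in both cases. A mismatched recency column could therefore report a row count that has little to do with the contents of `callback_request`.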
   
   ### How to reproduce
   
   Shell into the webserver or scheduler (really any container) and execute the following:
   
   ```
   airflow db clean --verbose --clean-before-timestamp $(date +"%F") --dry-run -t callback_request
   ```
   
   ### Operating System
   
   Debian GNU/Linux 11 (bullseye)
   
   ### Versions of Apache Airflow Providers
   
   ```
   $ pip freeze | grep apache-airflow-providers
   apache-airflow-providers-amazon==4.0.0
   apache-airflow-providers-celery==3.0.0
   apache-airflow-providers-cncf-kubernetes==4.1.0
   apache-airflow-providers-docker==3.0.0
   apache-airflow-providers-elasticsearch==4.0.0
   apache-airflow-providers-ftp==3.0.0
   apache-airflow-providers-google==8.1.0
   apache-airflow-providers-grpc==3.0.0
   apache-airflow-providers-hashicorp==3.0.0
   apache-airflow-providers-http==3.0.0
   apache-airflow-providers-imap==3.0.0
   apache-airflow-providers-microsoft-azure==4.0.0
   apache-airflow-providers-mysql==3.0.0
   apache-airflow-providers-odbc==3.0.0
   apache-airflow-providers-postgres==5.0.0
   apache-airflow-providers-redis==3.0.0
   apache-airflow-providers-sendgrid==3.0.0
   apache-airflow-providers-sftp==3.0.0
   apache-airflow-providers-slack==5.0.0
   apache-airflow-providers-sqlite==3.0.0
   apache-airflow-providers-ssh==3.0.0
   ```
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   ```
   helm repo add apache-airflow https://airflow.apache.org
   airflow_manifest=$(mktemp)
   envsubst < ./airflow.helm.yml > ${airflow_manifest}
   helm upgrade -i -f ${airflow_manifest} airflow apache-airflow/airflow
   ```
   
   ```yaml
   # https://github.com/apache/airflow/blob/main/chart/values.yaml
   # relevant parts of airflow.helm.yml
   
   # Airflow version
   airflowVersion: 2.3.3
   
   executor: KubernetesExecutor
   
   data:
     metadataConnection:
       host: $AIRFLOW_DATABASE_HOST
       port: $AIRFLOW_DATABASE_PORT
       user: $AIRFLOW_DATABASE_USER
       pass: $AIRFLOW_DATABASE_PASSWORD
       db: $AIRFLOW_DATABASE_NAME
       protocol: $AIRFLOW_DATABASE_PROTOCOL
       sslmode: disable
   
   postgresql:
     enabled: false
   
   ```
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk closed issue #25063: Airflow DB Clean Fails for `callback_request`

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #25063: Airflow DB Clean Fails for `callback_request`
URL: https://github.com/apache/airflow/issues/25063




[GitHub] [airflow] potiuk commented on issue #25063: Airflow DB Clean Fails for `callback_request`

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #25063:
URL: https://github.com/apache/airflow/issues/25063#issuecomment-1186108743

   This problem has already been fixed in https://github.com/apache/airflow/pull/23574. I marked #23574 as a candidate for the 2.3.4 release.




[GitHub] [airflow] boring-cyborg[bot] commented on issue #25063: Airflow DB Clean Fails for `callback_request`

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #25063:
URL: https://github.com/apache/airflow/issues/25063#issuecomment-1184584990

   Thanks for opening your first issue here! Be sure to follow the issue template!
   

