You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "mis98zb (via GitHub)" <gi...@apache.org> on 2023/07/27 07:14:30 UTC

[GitHub] [airflow] mis98zb opened a new pull request, #32876: enable db access for dag run listeners

mis98zb opened a new pull request, #32876:
URL: https://github.com/apache/airflow/pull/32876

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] mis98zb commented on pull request #32876: enable db access for dag run listeners

Posted by "mis98zb (via GitHub)" <gi...@apache.org>.
mis98zb commented on PR #32876:
URL: https://github.com/apache/airflow/pull/32876#issuecomment-1689203017

   instead of read xcom in listener, read xcom via airflow restful api from backend service.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on pull request #32876: enable db access for dag run listeners

Posted by "uranusjr (via GitHub)" <gi...@apache.org>.
uranusjr commented on PR #32876:
URL: https://github.com/apache/airflow/pull/32876#issuecomment-1653043727

   This breaks backward compatibility. Not sure if it is a good idea. What’s the problem with opening a session in the hook instead?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] mis98zb commented on pull request #32876: enable db access for dag run listeners

Posted by "mis98zb (via GitHub)" <gi...@apache.org>.
mis98zb commented on PR #32876:
URL: https://github.com/apache/airflow/pull/32876#issuecomment-1689168362

   Yes, you are right.
   There are variable ways to call external APIs.
   
   Previously I used airflow web UI to add connections, which are used in tasks.
   So the connections are stored in airflow db...
   Currently I'm using environment variables for connections.
   Everything working fine, except that I cannot see connections in web UI anymore, :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] mis98zb closed pull request #32876: enable db access for dag run listeners

Posted by "mis98zb (via GitHub)" <gi...@apache.org>.
mis98zb closed pull request #32876: enable db access for dag run listeners
URL: https://github.com/apache/airflow/pull/32876


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] mis98zb commented on pull request #32876: enable db access for dag run listeners

Posted by "mis98zb (via GitHub)" <gi...@apache.org>.
mis98zb commented on PR #32876:
URL: https://github.com/apache/airflow/pull/32876#issuecomment-1653058599

   And, there are similar exception when I use hook inside the dag_run_listener.
   By investigating, I found that hooks will use `Connection.get_connection_from_secrets(conn_id)` to get connection.
   There are two default secret back ends: EnvironmentVariablesBackend and MetastoreBackend.
   The MetastoreBackend uses `@provide_session` to open a new session, and do commit.
   The exception is raised on committing...
   
   For this, I have to change to use environment variables to set connection.
   But this is not encrypted.
   Do you have suggestions for this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] mis98zb commented on pull request #32876: enable db access for dag run listeners

Posted by "mis98zb (via GitHub)" <gi...@apache.org>.
mis98zb commented on PR #32876:
URL: https://github.com/apache/airflow/pull/32876#issuecomment-1678742961

   Thank you for the answer!
   
   But seems this does not work for me.
   The `@provide_session` will call submit() when the function exit, and it's inside the `@provide_session` of scheduler.
   This will cause the `UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS` exception...
   
   Maybe run the listeners inside the scheduler loop is not so good.
   If it is run as a task, that should be great...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #32876: enable db access for dag run listeners

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on PR #32876:
URL: https://github.com/apache/airflow/pull/32876#issuecomment-1688013185

   > Fortunately, there is workaround for this. We can store connections in places other than airflow db. For example, in environment variables...
   
   Or secret backends, or even filesystem secret backend. I wouldn't even call it workaround it's pretty standard approach. 
   
   Also there is no particular reason to store the API credentials as connection in the DB for such API call. It does not even have to use "Connection" mechanism. You can simply make a regular env variable, mounted as secret in K8S for that. The calls to the API  via Airflow Python client are not assuming you are going to use Connection for it. It has never meant to be used with Airflow Connections actually, as the whole idea is to call it from "outside" of Airflow (in this case you will call it from "inside" but this is accidental). So credentials can be passed from any source - Connection seems like unnecessary abstraction to use for it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] mis98zb commented on pull request #32876: enable db access for dag run listeners

Posted by "mis98zb (via GitHub)" <gi...@apache.org>.
mis98zb commented on PR #32876:
URL: https://github.com/apache/airflow/pull/32876#issuecomment-1653048077

   I tried with `@provide_session`, it will got exception with HA configuration:
   ```
   UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #32876: enable db access for dag run listeners

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on PR #32876:
URL: https://github.com/apache/airflow/pull/32876#issuecomment-1682774959

   I agree, this is a very bad idea to allow db access here. There are multiple reasons (HA locks is the most important one) why "user provided code" should not mess around the sessions and DB while  dag runs are processed by the scheduler. We heavily rely on those db operations to be performant, have very strictly demarcated session boundaries. Adding a DB code run by user in some crucial moments, might cause a number of subtle problems that might be extremely difficult to diagnose. Basicaly by opening this option you are shooting yourself in the foot (and actually more likely you will be passing the problems on maintainers when some unintended side-effects kick in).
   
   Alos listener by definition should avoid doing any "heavy" work. It just listens, by implementing some database/heavy processing in it, you are likely impacting scheduler's performance a lot. Don't do it.
   
   IMHO trying to run DB operations and reading Xcom here is doubly wrong:
   
   * You should use Airflow REST API to access XCom information
   *  You should not run any heavy workload in your listeners. Your listerners should at most trigger (for example) a lambda  function in AWS to perform such summary.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] mis98zb commented on pull request #32876: enable db access for dag run listeners

Posted by "mis98zb (via GitHub)" <gi...@apache.org>.
mis98zb commented on PR #32876:
URL: https://github.com/apache/airflow/pull/32876#issuecomment-1687782795

   Got it.
   I'll change to call restful api in listener. And let the api it call airflow rest api to get error details.
   
   However, there is still a problem:
   To call restful api, airflow need get connections. If the connections are stored in airflow db, the `UNEXPECTED COMMIT` exception will be thrown.
   That is, connections for restful api which used in listener cannot be stored in airflow db.
   Fortunately, there is workaround for this. We can store connections in places other than airflow db. For example, in environment variables...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org