You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/09/24 16:06:06 UTC

[GitHub] [airflow] Mikhail-M opened a new pull request #18506: Support for mariadb >= 10.6

Mikhail-M opened a new pull request #18506:
URL: https://github.com/apache/airflow/pull/18506


   As written in section https://airflow.apache.org/docs/apache-airflow/stable/concepts/scheduler.html#database-requirements, MariaDB higher than 10.6 should work well in HA mode, but has not been tested, so I have tested it and it doesn't work because
   1) MariaDB 10.6 doesn't support  update OF, but support NOWAIT and SKIP LOCKED well, so in current version of AIRFLOW if condition is wrong -- fixed in this PR
   2) SQLAlchemy==1.3.24 knew nothing about MariaDB 10.6 and contains this fragment of code https://github.com/sqlalchemy/sqlalchemy/blob/rel_1_3_24/lib/sqlalchemy/dialects/mysql/base.py#L1609 as a result, we get query  without SKIP LOCKED. It works correct in 1.4.0 and higher but we can't update this dependency because of https://github.com/apache/airflow/issues/14811 (webserver crash). In our infrastructure I tested latest version of SQLAlchemy and Scheduler work well, but webserver not. 
   
   As far as we merged thin PR and update SQLAlchemy Airflow will support HA Schedulers with MariaDB 10.6+
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #18506: Support for mariadb >= 10.6

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-926746209


   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, itโ€™s a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better ๐Ÿš€.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962263641


   > > This is all I am here about our statement. The "best effort" is just wrong statement IMHO.
   > 
   > Check my very first comment: [#18506 (comment)](https://github.com/apache/airflow/pull/18506#issuecomment-961495961) -- I am against officially supporting it :)
   
   That's cooll. But let's not call it "best effort" - lets call it "no support". 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-926830941


   I think it would be great to get it all in one go - one problem I have with this PR is that it has no tests, so we do not really know if it works (and any future changes might still break it) - adding CI support will not only test it, but prevent any regressions in the future. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962225083


   > We know some of the issues with both MySQL and Mariadb and Cloud Composer shifting to Postgres is also a good example. If the underlying DB has some issues like "deadlocks" which won't be easily tested via tests as they are sporadic, we will have to spend time to investigate and fix those if we officially support it.
   
   Well. I do agree MySQL is bad for us. But IMHO MariaDB will not increase it. The only way we can deal with it is to DROP both MySQL and MariaDB. And I am 100% for it to be honest. But if we are supporting MySQL, adding MariaDB is virtually no effort IMHO.
   
   I would be all hands up to drop MySQL in Airlfow 3. If we agree to that, tben I fully agree adding MariaDB is pointless. But if we don't, NOT adding MariaDB is creating more problems that it solves IMHO


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962265853


   Me too. My opinion is that trying to "soften" the message is much worse that being very straightforward. We should be more curagous sometimes to say "We do not do that" rahter thatn "Well, maybe that we wll do but not entirely". For me - merging that change and removing the tests but bluntly saying "we do not support MariaDB" is FAR better than saying "we do best effort to support MariaDB". We don't. It does not work. some people did some changes to make it work, Maybe ut does for some people , but we do not - as a community support it at all. 
   
   Just it - transparent, open communication and properly setting the expectations of people who are receiving the mesage. By saying 'best effort" we go into the "blurry" zone - where our expectation is that Maria DB does not work, but the user expectation who happens to know Maria DB is that it works. This is a wrong place to be in where expectations are so different on both sides.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Mikhail-M commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
Mikhail-M commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962229295


   > I disagree completely with "no test" as "no effort".
   > We made sure even if the experience was sub par for Scheduler HA, to atleast make sure it works.
   
   In my experience Scheduler HA with MariaDB backend doesn't work now(deadlocks), so I decided to prepare this patch (The details in PR description). 
   Now we are using fork with this patch and it looks like it works fine (Airflow 2.1+, 2.2 + MariaDB Galera Cluster have been tested). 
   
   But overal, it works well, because, as @potiuk mentioned, MariaDB is almost 100% compatible with MySQL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] closed pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #18506:
URL: https://github.com/apache/airflow/pull/18506


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-934650703


   Hey @Mikhail-M  -> are you still working on this ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961508860


   > I am really really not sure about this. I still think we should "officially" support limited DBs and make sure they work 100% with all combinations
   
   Yeah. That would be ideal.  However I believe we have quite "safe" test approach here.
   
   Happy to discus it further but I think we have quite a room for optimising the tests.
   
   I believe providers are not even supposed to know anything about the underlying DB. They will - at most - use Variables/Connections etc. I am not sure if there is a value on running Google Provider vs. MySQL or Postgres DB.
   I can't easily imagine any case that might casue error from Provider uses that will be actually db-dependent. I'd love to see what those could be?
   
   There might be some very specific exceptions (like some "core extensions") but those can be easily separated. 
   
   I honestly can't remember (but of course I can't remember all cases) where we had a "Provider"-originated failed tests only causing Postges or MySQL or Sqlite failures (except the resources/temporary/intermittent errors). 
   
   And that might even be an interesting step for the future "multi-tenancy" work  possibly - any of the Providers should never touch the DB directly. I think they should only access the Variables/Connections etc. - generally the "internal APIs" exposed by Airflow.  The "no direct access to db" might even be - in the future - a "criteria" to pass tests by a provider - so getting to the point where we can safely say "no direct metadata db access from providers" might be an important goal to achieve for all comunity providers (and it could be tested/enforced by intelligent mocking of SQLAlchemy acess to only allow specific access).
   
   For other, non-provider tests - I am quite certain (but I have not yet looked at it in detail) that we have a number of tests that are not even touching th DB / SQLAlachemy already (even in "core" of airflow). It's very likely we can easily separate some (or even a lot) of them and we could even be 100% sure of their "no DB" access. We could mock the sqlalchemy acces in the way that whenever it happens the test will faill. Such tests simply need not be run with different DBs. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962262830


   >Just to reverse the logic - do you think we also support "Oracle" now as "best effort"?
   
   Any DB that is not in our CI is not "officially supported". MariaDB or MSSQL were solved by at least some of the community members. We did it when doing Scheduler HA, Bas did it for MSSQL even before we had experimental support.
   
   The main point there is "that we do not provide guarantee that it will 100% work"
   
   >I think it's exactly the same situation as MariaDB - it is supported by SqlAlchemy and we have no tests for it.
   
   There is no way we can support all the DBs supported by SqlAlchemy. `SqlAlchemy` is for unifying DB API's but as you can see it still gets difficult for them. 
   
   > Or should we say we don't support at all "Oracle" and "MariaDB". I really wonder what's the difference now :) ?
   
   Yup we don't officially support -- " We provide no guarantee that it will work" . We don't really need to say what DBs we don't support :) We just need to say what DBs we officially support.
   
   >I am not really trying to be mean, just want to try to understand what "best effort" means really..
   
   The only thing I wanted to convey was that we shouldn't officially support yet another DB. "best effort" for me is literally we will accept fixes if someone creates PRs as long as it doesn't break existing support for our officially supported DBs. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962262489


   And let me be very clear on that as well - I am also OK on merging that PR without the tests. But I oppose on calling it "best efort". I stand by my statement that not running tests is "we do not suppot" MariaDb. and whenever someeone reports issue with MariaDB, our answer should be (as today ) - we do not support it, change to MySQL. Postgres or MsSQL. . The answer 'we do best effort' to have Maria DB is simply wrong statement that bears no meaning whatsoever.  From our perspectie either we are OK with our users to run MariaDB or not. 
   
   IMHO We should bluntly say "MariaDB  does not work" if we have no tests. Because we really do not know. We never tested it. We do not even know which versions are supported to which extent and which configurations are supported for encoding and the like. There is no "maybe".  
   
   This is all I am here about our stament. The "best effort" is just wrong statement IMHO.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-926779315


   > I think if we REALLY want to support mariadb, we should simply add it to the matrix of tests - similarly as we did with MsSQL. It will be far simpler than MsSQL because MariaDB is almost 100% compatible with MySQL. Would you maybe @Mikhail-M like to add fulll support for mariadb (and possibly solve any problems when they arise ? I am happy to guide you with that.
   
   Cool/. You can take a look as the inspiration:
   
   * https://github.com/apache/airflow/pull/9973
   
   It was then followed up by a few more as we realized that we need more:
   * https://github.com/apache/airflow/pull/16143
   * https://github.com/apache/airflow/pull/16134
   * https://github.com/apache/airflow/pull/16103
   * https://github.com/apache/airflow/pull/17234
   * https://github.com/apache/airflow/pull/18382
   
   If you foilow what has been added there, we should be good :) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-926780720


   Most of the changes in the biggest - first change were to fix various MSSQL incompatibilities, so you will likely not need those.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-955038067


   You're using `INSTALL_MARIA_CLIENT` in some places and `INSTALL_MARIADB_CLIENT` some else and the linter is catching it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Mikhail-M commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
Mikhail-M commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962236172


   > * We do not need to add mariadb clients to image (mysql libs works for it 100%)
   
   Agree, it is true


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961510974


   > > I am really really not sure about this. I still think we should "officially" support limited DBs and make sure they work 100% with all combinations
   > 
   > Yeah. That would be ideal. However I believe we have quite "safe" test approach here.
   > 
   > Happy to discus it further but I think we have quite a room for optimising the tests.
   > 
   > I believe providers are not even supposed to know anything about the underlying DB. They will - at most - use Variables/Connections etc. I am not sure if there is a value on running Google Provider vs. MySQL or Postgres DB. I can't easily imagine any case that might casue error from Provider uses that will be actually db-dependent. I'd love to see what those could be?
   > 
   > There might be some very specific exceptions (like some "core extensions") but those can be easily separated.
   > 
   > I honestly can't remember (but of course I can't remember all cases) where we had a "Provider"-originated failed tests only causing Postges or MySQL or Sqlite failures (except the resources/temporary/intermittent errors).
   > 
   > And that might even be an interesting step for the future "multi-tenancy" work possibly - any of the Providers should never touch the DB directly. I think they should only access the Variables/Connections etc. - generally the "internal APIs" exposed by Airflow. The "no direct access to db" might even be - in the future - a "criteria" to pass tests by a provider - so getting to the point where we can safely say "no direct metadata db access from providers" might be an important goal to achieve for all comunity providers (and it could be tested/enforced).
   > 
   > For other, non-provider tests - I am quite certain (but I have not yet looked at it in detail) that we have a number of tests that are not even touching th DB / SQLAlachemy already (even in "core" of airflow). It's very likely we can easily separate some (or even a lot) of them and we could even be 100% sure of their "no DB" access. We could mock the sqlalchemy acces in the way that whenever it happens the test will faill. Such tests simply need not be run with different DBs.
   
   It's not just tests, it's DB Migrations that needs to be taken care of. We are creating some debt already if lots of "if.. mssql do this" .. if "mariab do Y". And while we already know that there are good number of issues with MySQL and MariaDB, I would not be in favor of officially supporting MariaDB. "best effort" is good tbh.
   
   We could really use that time and resource on new features


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961508860


   > I am really really not sure about this. I still think we should "officially" support limited DBs and make sure they work 100% with all combinations
   
   Yeah. That would be ideal.  However I believe we have quite "safe" test approach here.
   
   Happy to discus it further but I think we have quite a room for optimising the tests.
   
   I believe providers are not even supposed to know anything about the underlying DB. They will - at most - use Variables/Connections etc. I am not sure if there is a value on running Google Provider vs. MySQL or Postgres DB.
   I can't really imagine any case that will be detected by Provider change that will be db-dependent?  There might be some very specific exceptions (like some "core extensions") but those can be easily separated. 
   
   I honestly can't remember (but of course I can't remember all cases) where we had a "Provider"-originated failed tests only causing Postges or MySQL or Sqlite failures (except the resources/temporary/intermittent errors). 
   
   And that might even be an interesting step for the future "multi-tenancy" work  possibly - any of the Providers should never touch the DB directly. I think they should only access the Variables/Connections etc. - generally the "internal APIs" exposed by Airflow.  The "no direct access to db" might even be - in the future - a "criteria" to pass tests by a provider - so getting to the point where we can safely say "no direct metadata db access from providers" might be an important goal to achieve for all comunity providers (and it could be tested/enforced).
   
   For other, non-provider tests - I am quite certain (but I have not yet looked at it in detail) that we have a number of tests that are not even touching th DB / SQLAlachemy already (even in "core" of airflow). It's very likely we can easily separate some (or even a lot) of them and we could even be 100% sure of their "no DB" access. We could mock the sqlalchemy acces in the way that whenever it happens the test will faill. Such tests simply need not be run with different DBs. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962249843


   >We can also add MariaDB to the test suite but say that we don't "officially" support it yet. similarly as wtih all other changes I am hesitant to approve a change that we have no tests that test it. That sounds very inconsistent.
   
   Approve a change with no tests? Since we support Postgres, MySQL and MSSQL and the tests are passing with them, what more do you need? I don't see it is as inconsistent, we have doing that for ages. That is the main difference on what is officially supported vs not.
   
   That's what we did it MSSQL way before it was even experimental. Some code that keeps the existing functionality but fixes it for the other is not inconsistent for sure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962427628


   I am fine with that :) - even more. I will write my thoughts on devlist about mysql :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-975634568


   > Is the MariaDB support decision finalized yet?
   
   I don't think it is - I think if you want to get support, the best way to make the decision is to state your position as someone who wants it in the discussion in the devlist and argue for it. I think I was the only one (from those speaking so far) who thinks that we should add tests and full support. 
   
   For me the current status quo is "we do not support MariaDB". Full stop. Even though we know there are many of our users who use MariaDB, they do it fully on their own - and I am going to mostly close/convert into discussion any issues where MariaDB is mentioned and it will look like "maybe a a database issue" (this is very much consequences of the status quo). At least that's my understanding of it (though I think it woudl be better for our users to support it).
   
   And just to elaborate a little why I take this 'stance':
   
   I believe eiether we say "we support mariadb" - and add tests, or we say "we do not support it" when we have no tests. 
   
   I think there is no middle-ground and "best-effort" when we (as community) do "no effort" to make sure the support is there, and especially no effort to make sure it is not broken in the future (this is what tests are really about). 
   
   We are adding new features and there was at least one case where lack of certain DB feature caused change of direction of the implementation (https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-42%3A+Dynamic+Task+Mapping) - big part of the implementation was changed because MySQL does not support big indexes. And without tests we have no guarantee that we will not hit similar limitations of MariaDB in specific versions, especialy that MariaDB versioning is very different from MySQL and some features are appearing with various versions and different cadence. So currently we are not even able to say which version of MySQl our users could use.
   
   So for me at least status quo is "we have no tests" which is equivalent to "we do not support it". I was not able to gain enough support for "supporting" MariaDB, so without more support and discussion on the devlist, the status quo is "we do not support MariaDB".
   
   But I think if you could chime in from the user perspective and say how much you would like - and why - have support for MariaDB on the devlist, might encourage others to speak and possibly support it or at least revive the discussion. Maybe if we know how many of our users are actually using MariaDB could help (but I think for that we would need to wait for our annual survey - I am definitely planning to add such question for "MySQL" users) - the case would be stronger, but I have not enough data to back my "feeling" that we have a lot of users using MariaDB - possibly even more than MySQL users.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-975635187


   The link to discussion is here: 
   Here is the thread link: https://lists.apache.org/thread/dp78j0ssyhx62008lbtblrc856nbmlfb


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961687297


   > It's not just tests, it's DB Migrations that needs to be taken care of. We are creating some debt already if lots of "if.. mssql do this" .. if "mariab do Y". And while we already know that there are good number of issues with MySQL and MariaDB, I would not be in favor of officially supporting MariaDB. "best effort" is good tbh.
   
   If it woukld take a lot of time - I'd heartily agree, but I also believe MsSQL was an outlier as it soo much different in a number of aspects than what we had before.
   
   MariaDB is extremely close to MySQL. I would expect a very, very small fraction of problems we had with MsSQL for MariaDB. I think we already have a good deal of users who use MariaDB because of it MySQL nearly 100% compatibility. And for many people MySQL == MariaDB. And it also could be done as a "backburner" - learning from MsSQL experience we could add tests for it, mark it as "experimental" (i.e. best effort) - however we'd make it not fail the PRs when testing and see how many problems we have - and then decide if it is worth to be supported officially. We do not have to do it immediately. But making our tests run with MariaDB is the very essence of "best effort" we can imagine. 
   
   Not having tests in our case is not a "best effort" - it's "no effort" really.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962269306


   Yup and in my opinion, we stick with what we know and are 100% sure even in terms of performance and edge cases. In trying to increase the number of supported DBs we increase maintenance and provide a "sub-par" experience for our users.
   
   A wise man once said "Do less but do them well" or something similar to the following image :) 
   
   ![image](https://user-images.githubusercontent.com/8811558/140587553-6dd72f35-30c7-4dce-93a3-8a63554865cd.png)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962225083


   > We know some of the issues with both MySQL and Mariadb and Cloud Composer shifting to Postgres is also a good example. If the underlying DB has some issues like "deadlocks" which won't be easily tested via tests as they are sporadic, we will have to spend time to investigate and fix those if we officially support it.
   
   Well. I do agree MySQL is bad for us. But IMHO MariaDB will not add to it. The only way we can deal with it is to DROP both MySQL and MariaDB. And I am 100% for it to be honest. But if we are supporting MySQL, adding MariaDB is virtually no effort IMHO.
   
   I would be all hands up to drop MySQL in Airlfow 3. If we agree to that, then I fully agree adding MariaDB is pointless. But if we don't, NOT adding MariaDB is creating more problems that it solves IMHO


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962253985


   > "We can also add MariaDB to the test suite but say that we don't "officially" support it yet" **is inconsistent** -- If we have it in CI we support it or plan to support it. "best effort" is as someone encounters and fixes it they will or we do same as what we did with HA.
   
   Just to reverse the logic - do you think we also support "Oracle" now as "best effort"? 
   
   I think it's exactly the same situation as MariaDB - it is supported by SqlAlchemy and we have no tests for it. 
   
   Or should we say we don't support at all "Oracle" and "MariaDB".  I really wonder what's the difference now :) ?
   
   I am not really trying to be mean, just want to try to understand what "best effort" means really.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Mikhail-M commented on a change in pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
Mikhail-M commented on a change in pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#discussion_r739517675



##########
File path: airflow/utils/sqlalchemy.py
##########
@@ -177,6 +179,14 @@ def process_result_value(self, value, dialect):
         return data
 
 
+def _is_dialect_good_for_row_locks(dialect: DefaultDialect) -> bool:
+    supports_for_update_of = dialect.name != "mysql" or dialect.supports_for_update_of
+    mariadb_10_6 = (
+        isinstance(dialect, MySQLDialect) and dialect._is_mariadb and dialect.server_version_info >= (10, 6)
+    )

Review comment:
       Unfortunately, it looks like there is no better way. 
   
   In our version of sqlalchemy it realized this way --  https://github.com/sqlalchemy/sqlalchemy/blob/rel_1_3_24/lib/sqlalchemy/dialects/mysql/base.py#L2686, so we can inline it, but this is most likely worse
   
   In the modern version of sqlalchemy we can replace it with dialect.is_mariadb (and it is backward compatible)
   https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/dialects/mysql/base.py#L2756
   
   Do you have any suggestions?
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961510974


   > > I am really really not sure about this. I still think we should "officially" support limited DBs and make sure they work 100% with all combinations
   > 
   > Yeah. That would be ideal. However I believe we have quite "safe" test approach here.
   > 
   > Happy to discus it further but I think we have quite a room for optimising the tests.
   > 
   > I believe providers are not even supposed to know anything about the underlying DB. They will - at most - use Variables/Connections etc. I am not sure if there is a value on running Google Provider vs. MySQL or Postgres DB. I can't easily imagine any case that might casue error from Provider uses that will be actually db-dependent. I'd love to see what those could be?
   > 
   > There might be some very specific exceptions (like some "core extensions") but those can be easily separated.
   > 
   > I honestly can't remember (but of course I can't remember all cases) where we had a "Provider"-originated failed tests only causing Postges or MySQL or Sqlite failures (except the resources/temporary/intermittent errors).
   > 
   > And that might even be an interesting step for the future "multi-tenancy" work possibly - any of the Providers should never touch the DB directly. I think they should only access the Variables/Connections etc. - generally the "internal APIs" exposed by Airflow. The "no direct access to db" might even be - in the future - a "criteria" to pass tests by a provider - so getting to the point where we can safely say "no direct metadata db access from providers" might be an important goal to achieve for all comunity providers (and it could be tested/enforced).
   > 
   > For other, non-provider tests - I am quite certain (but I have not yet looked at it in detail) that we have a number of tests that are not even touching th DB / SQLAlachemy already (even in "core" of airflow). It's very likely we can easily separate some (or even a lot) of them and we could even be 100% sure of their "no DB" access. We could mock the sqlalchemy acces in the way that whenever it happens the test will faill. Such tests simply need not be run with different DBs.
   
   It's not just tests, it's DB Migrations that needs to be taken care of. We are creating some debt already if lots of "if.. mssql do this" .. if "mariab do Y". And while we already know that there are good number of issues with MySQL and MariaDB, I would not be in favor of officially supporting MariaDB. "best effort" is good tbh.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-926758711


   I think if we REALLY want to support mariadb, we should simply add it to the matrix of tests - similarly as we did with MsSQL. It will be far simpler than MsSQL because MariaDB is almost 100% compatible with MySQL. Would you maybe @Mikhail-M like to add fulll support for mariadb  (and possibly solve any problems when they arise ? I am happy to guide you with that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Mikhail-M commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
Mikhail-M commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-926772317


   @potiuk Yes, I am ready.
   I REALLY want to support MariaDB, because we use it currently. If you provide to me some information for next step it will be beautiful. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961162163


   > I'm not sure how I feel about adding another DB to the test matrix.
   > 
   > I think I'm leaning towards leaving MariaDB support as "best effort" but not actively tested.
   
   I'd take slightly different approach. I think what we could do actually, is to make another (long overdue) refactor of our test suites to only run part of the tests for all databases. For example almost all provider tests do not neeed to be run for all databases, similarly we could refactor/identify/mark some of the modules/packages in core that do not require multi-db testing as they won't need to access the db at all.  That will require a bit more automation (and for example some way of verification that potentially new DB-specific tests are not added to some excluded packages).
   
   I want to wait with that however until we rewrite the Breeze / CI to Python (which we plan to do with the Outreachy interns) and then we can likely think about implementing more elaborated logic in the selective-tests part especially. Then Adding a new DB will be far less of a pain.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961508860






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962269306


   Yup and my opinion are we stick with what we know and are 100% sure even in terms of performance and edge cases. In trying to increase the number of supported DBs we increase maintenance and provide a "sub-par" experience for our users.
   
   A wise man once said "Do less but do them well" or something similar to the following image :) 
   
   ![image](https://user-images.githubusercontent.com/8811558/140587553-6dd72f35-30c7-4dce-93a3-8a63554865cd.png)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962430492


   Here is the thread link: https://lists.apache.org/thread/dp78j0ssyhx62008lbtblrc856nbmlfb


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-975634568


   > Is the MariaDB support decision finalized yet?
   
   I don't think it is - I think if you want to get support, the best way to help with the decision is to state your position as someone who wants it in the discussion in the devlist and argue for it. I think I was the only one (from those speaking so far) who thinks that we should add tests and full support. 
   
   For me the current status quo is "we do not support MariaDB". Full stop. Even though we know there are many of our users who use MariaDB, they do it fully on their own - and I am going to mostly close/convert into discussion any issues where MariaDB is mentioned and it will look like "maybe a a database issue" (this is very much consequences of the status quo). At least that's my understanding of it (though I think it woudl be better for our users to support it).
   
   And just to elaborate a little why I take this 'stance':
   
   I believe eiether we say "we support mariadb" - and add tests, or we say "we do not support it" when we have no tests. 
   
   I think there is no middle-ground and "best-effort" when we (as community) do "no effort" to make sure the support is there, and especially no effort to make sure it is not broken in the future (this is what tests are really about). 
   
   We are adding new features and there was at least one case where lack of certain DB feature caused change of direction of the implementation (https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-42%3A+Dynamic+Task+Mapping) - big part of the implementation was changed because MySQL does not support big indexes. And without tests we have no guarantee that we will not hit similar limitations of MariaDB in specific versions, especialy that MariaDB versioning is very different from MySQL and some features are appearing with various versions and different cadence. So currently we are not even able to say which version of MySQl our users could use.
   
   So for me at least status quo is "we have no tests" which is equivalent to "we do not support it". I was not able to gain enough support for "supporting" MariaDB, so without more support and discussion on the devlist, the status quo is "we do not support MariaDB".
   
   But I think if you could chime in from the user perspective and say how much you would like - and why - have support for MariaDB on the devlist, might encourage others to speak and possibly support it or at least revive the discussion. Maybe if we know how many of our users are actually using MariaDB could help (but I think for that we would need to wait for our annual survey - I am definitely planning to add such question for "MySQL" users) - the case would be stronger, but I have not enough data to back my "feeling" that we have a lot of users using MariaDB - possibly even more than MySQL users.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-975634568


   > Is the MariaDB support decision finalized yet?
   
   I don't think it is - I think if you want to get support, the best way to help with the decision is to state your position as someone who wants it in the discussion in the devlist and argue for it. I think I was the only one (from those speaking so far) who thinks that we should add tests and full support. 
   
   For me the current status quo is "we do not support MariaDB". Full stop. Even though we know there are many of our users who use MariaDB, they do it fully on their own - and I am going to mostly close/convert into discussion any issues where MariaDB is mentioned and it will look like "maybe a a database issue" (this is very much consequences of the status quo). At least that's my understanding of it (though I think it woudl be better for our users to support it).
   
   And just to elaborate a little why I take this 'stance':
   
   I believe eiether we say "we support MariaDB" - and add tests, or we say "we do not support it" when we have no tests. 
   
   I think there is no middle-ground and "best-effort" when we (as community) do "no effort" to make sure the support is there, and especially no effort to make sure it is not broken in the future (this is what tests are really about). 
   
   We are adding new features and there was at least one case where lack of certain DB feature caused change of direction of the implementation (https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-42%3A+Dynamic+Task+Mapping) - big part of the implementation was changed because MySQL does not support big indexes. And without tests we have no guarantee that we will not hit similar limitations of MariaDB in specific versions, especialy that MariaDB versioning is very different from MySQL and some features are appearing with various versions and different cadence. So currently we are not even able to say which version of MySQl our users could use.
   
   So for me at least status quo is "we have no tests" which is equivalent to "we do not support it". I was not able to gain enough support for "supporting" MariaDB, so without more support and discussion on the devlist, the status quo is "we do not support MariaDB".
   
   But I think if you could chime in from the user perspective and say how much you would like - and why - have support for MariaDB on the devlist, might encourage others to speak and possibly support it or at least revive the discussion. Maybe if we know how many of our users are actually using MariaDB could help (but I think for that we would need to wait for our annual survey - I am definitely planning to add such question for "MySQL" users) - the case would be stronger, but I have not enough data to back my "feeling" that we have a lot of users using MariaDB - possibly even more than MySQL users.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962262489


   And let me be very clear on that as well - I am also OK on merging that PR without the tests. But I oppose on calling it "best efort". I stand by my statement that not running tests is "we do not suppot" MariaDb. and whenever someeone reports issue with MariaDB, our answer shoudl be (as today ) - we do not support it, change to MySQL. Postgres or MsSQL. . The answer 'we do best effort' to have Maria DB is simply wrong statement that bears no meaning whatsoever.  From our perspectie either we are OK with our users to run MariaDB or not. 
   
   IMHO We should bluntly say "MariaDB  does not work" if we have no tests. Because we really do not know. We never tested it. We do not even know which versions are supported to which extent and which configurations are supported for encoding and the like. There is no "maybe".  
   
   This is all I am here about our stament. The "best effort" is just wrong statement IMHO.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961508860


   > I am really really not sure about this. I still think we should "officially" support limited DBs and make sure they work 100% with all combinations
   
   Yeah. That would be ideal.  However I believe we have quite "safe" test approach here.
   
   Happy to discus it further but I think we have quite a room for optimising the tests.
   
   I believe providers are not even supposed to know anything about the underlying DB. They will - at most - use Variables/Connections etc. I am not sure if there is a value on running Google Provider vs. MySQL or Postgres DB.
   I can't easily imagine any case that might casue error from Provider uses that will be actually db-dependent. I'd love to see what those could be?
   
   There might be some very specific exceptions (like some "core extensions") but those can be easily separated. 
   
   I honestly can't remember (but of course I can't remember all cases) where we had a "Provider"-originated failed tests only causing Postges or MySQL or Sqlite failures (except the resources/temporary/intermittent errors). 
   
   And that might even be an interesting step for the future "multi-tenancy" work  possibly - any of the Providers should never touch the DB directly. I think they should only access the Variables/Connections etc. - generally the "internal APIs" exposed by Airflow.  The "no direct access to db" might even be - in the future - a "criteria" to pass tests by a provider - so getting to the point where we can safely say "no direct metadata db access from providers" might be an important goal to achieve for all comunity providers (and it could be tested/enforced by intelligent mocking of SQLAlchemy acess to only allow specific access).
   
   For other, non-provider tests - I am quite certain (but I have not yet looked at it in detail) that we have a number of tests that are not even touching th DB / SQLAlachemy already (even in "core" of airflow). It's very likely we can easily separate some (or even a lot) of them and we could even be 100% sure of their "no DB" access. We could mock the sqlalchemy acces in the way that whenever it happens the test will faill. Such tests simply need not to be run with different DBs. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961508860


   > I am really really not sure about this. I still think we should "officially" support limited DBs and make sure they work 100% with all combinations
   
   Yeah. That would be ideal.  However I believe we have quite "safe" test approach here.
   
   Happy to discus it further but I think we have quite a room for optimising the tests.
   
   I believe providers are not even supposed to know anything about the underlying DB. They will - at most - use Variables/Connections etc. I am not sure if there is a value on running Google Provider vs. MySQL or Postgres DB.
   I can't easily imagine any case that might casue error from Provider uses that will be actually db-dependent. I'd love to see what those could be?
   
   There might be some very specific exceptions (like some "core extensions") but those can be easily separated. 
   
   I honestly can't remember (but of course I can't remember all cases) where we had a "Provider"-originated failed tests only causing Postges or MySQL or Sqlite failures (except the resources/temporary/intermittent errors). 
   
   And that might even be an interesting step for the future "multi-tenancy" work  possibly - any of the Providers should never touch the DB directly. I think they should only access the Variables/Connections etc. - generally the "internal APIs" exposed by Airflow.  The "no direct access to db" might even be - in the future - a "criteria" to pass tests by a provider - so getting to the point where we can safely say "no direct metadata db access from providers" might be an important goal to achieve for all comunity providers (and it could be tested/enforced).
   
   For other, non-provider tests - I am quite certain (but I have not yet looked at it in detail) that we have a number of tests that are not even touching th DB / SQLAlachemy already (even in "core" of airflow). It's very likely we can easily separate some (or even a lot) of them and we could even be 100% sure of their "no DB" access. We could mock the sqlalchemy acces in the way that whenever it happens the test will faill. Such tests simply need not be run with different DBs. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961508860






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961495961






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961508860






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#discussion_r716144336



##########
File path: airflow/utils/sqlalchemy.py
##########
@@ -177,6 +179,14 @@ def process_result_value(self, value, dialect):
         return data
 
 
+def _is_dialect_good_for_row_locks(dialect: DefaultDialect) -> bool:
+    supports_for_update_of = dialect.name != "mysql" or dialect.supports_for_update_of
+    mariadb_10_6 = (
+        isinstance(dialect, MySQLDialect) and dialect._is_mariadb and dialect.server_version_info >= (10, 6)
+    )

Review comment:
       Emm not sure depending on `_is_mariadb` is a good idea.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962251301


   You might have missed my reply in https://github.com/apache/airflow/pull/18506#issuecomment-962236269
   
   "We can also add MariaDB to the test suite but say that we don't "officially" support it yet" **is inconsistent** -- If we have it in CI we support it or plan to support it. "best effort" is as someone encounters and fixes it they will or we do same as what we did with HA.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962255088


   It's really about the "statement" and how we explain to our users when they ask. I have no good answer to that now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961510974






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962225083


   > We know some of the issues with both MySQL and Mariadb and Cloud Composer shifting to Postgres is also a good example. If the underlying DB has some issues like "deadlocks" which won't be easily tested via tests as they are sporadic, we will have to spend time to investigate and fix those if we officially support it.
   
   Well. I do agree MySQL is bad for us. But IMHO MariaDB will not add to it. The only way we can deal with it is to DROP both MySQL and MariaDB. And I am 100% for it to be honest. But if we are supporting MySQL, adding MariaDB is virtually no effort IMHO.
   
   I would be all hands up to drop MySQL in Airlfow 3. If we agree to that, tben I fully agree adding MariaDB is pointless. But if we don't, NOT adding MariaDB is creating more problems that it solves IMHO


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961508860






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961510974






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962262489


   And let me be very clear on that as well - I am also OK on merging that PR without the tests. But I oppose on calling it "best effort". I stand by my statement that not running tests is "we do not suppot" MariaDb. and whenever someeone reports issue with MariaDB, our answer should be (as today ) - we do not support it, change to MySQL. Postgres or MsSQL. . The answer 'we do best effort' to have Maria DB is simply wrong statement that bears no meaning whatsoever.  From our perspectie either we are OK with our users to run MariaDB or not. 
   
   IMHO We should bluntly say "MariaDB  does not work" if we have no tests. Because we really do not know. We never tested it. We do not even know which versions are supported to which extent and which configurations are supported for encoding and the like. There is no "maybe".  
   
   This is all I am here about our statement. The "best effort" is just wrong statement IMHO.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962236269


   >Otherwise MariaDB is just a flavour of MySQL that people use and and by not testing it we wil have more issues to handle on a daily basis:
   
   I would rather argue that if MariaDB is like everyone knows a flavor of MySQL there is no need of adding more tests.
   
   
   >If we think MySQL is bad for us - let's take a "curagous' step and drop it (knowing the consequences).
   
   I am all for it but since we had "officially" supported it we can't hang the current users to dry. This is why "officially" supporting means a lot and I am opposed to saying we officially support MariaDB.
   
   >People are already using MariaDB seeing "mysql" is supported
   However similar, they are not the same and this is why both of them exists in the truth though. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962237917


   To be very clear I am ok to have the "code" fix that you have in a separate PR but I am not OK with adding MariaDB to the CI and saying we officially support it


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Mikhail-M commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
Mikhail-M commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-954842527


   > Hey @Mikhail-M -> are you still working on this ?
   
   Yes, I am. 
   Sorry, it was a busy days.  But today I had a chance to do something. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Mikhail-M commented on a change in pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
Mikhail-M commented on a change in pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#discussion_r739517675



##########
File path: airflow/utils/sqlalchemy.py
##########
@@ -177,6 +179,14 @@ def process_result_value(self, value, dialect):
         return data
 
 
+def _is_dialect_good_for_row_locks(dialect: DefaultDialect) -> bool:
+    supports_for_update_of = dialect.name != "mysql" or dialect.supports_for_update_of
+    mariadb_10_6 = (
+        isinstance(dialect, MySQLDialect) and dialect._is_mariadb and dialect.server_version_info >= (10, 6)
+    )

Review comment:
       Unfortunately, it looks like there is no better way. 
   
   In our version of sqlalchemy it realized this way --  https://github.com/sqlalchemy/sqlalchemy/blob/rel_1_3_24/lib/sqlalchemy/dialects/mysql/base.py#L2686, so we can inline it, but this is most likely worse
   
   In the modern version of sqlalchemy we can replace it with dialect.is_mariadb (and it is backward compatible)
   https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/dialects/mysql/base.py#L2756
   
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-1007031333


   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962236269


   >Otherwise MariaDB is just a flavour of MySQL that people use and and by not testing it we wil have more issues to handle on a daily basis:
   
   I would rather argue that if MariaDB is like everyone knows a flavor of MySQL there is no need of adding more tests.
   
   
   >If we think MySQL is bad for us - let's take a "curagous' step and drop it (knowing the consequences).
   
   I am all for it but since we had "officially" supported it we can't hang the current users to dry. This is why "officially" supporting means a lot and I am opposed to saying we officially support MariaDB.
   
   >People are already using MariaDB seeing "mysql" is supported
   
   However similar, they are not the same and this is why both of them exists in the truth though. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961495961


   I am really really not sure about this. I still think we should "officially" support limited DBs and make sure they work 100% with all combinations


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962253985


   > "We can also add MariaDB to the test suite but say that we don't "officially" support it yet" **is inconsistent** -- If we have it in CI we support it or plan to support it. "best effort" is as someone encounters and fixes it they will or we do same as what we did with HA.
   
   Just to reverse the logic - do you think we also support "Oracle" now as "best effort"? 
   
   I think it's exactly the same situation as MariaDB - it is supported by SqlAlchemy and we have no tests for it. 
   
   Or should we say we don't support at all "Oracle" and "MariaDB".  I really wonder what's the difference now :) ?
   
   I am not really trying to be mean, just want to try to understand what "best effort" means really..


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962251301


   You might have missed my reply in https://github.com/apache/airflow/pull/18506#issuecomment-962236269
   
   "We can also add MariaDB to the test suite but say that we don't "officially" support it yet" is inconsistent -- If we have it in CI we support it or plan to support it. "best effort" is as someone encounters and fixes it they will or we do same as what we did with HA.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961743999


   I disagree completely with "no test" as "no effort". 
   
   We made sure even if the experience was sub par for Scheduler HA, to atleast make sure it works.
   
   Adding to CI is a big NO - CI already fails in certain cases where the frustration would grow for users to see the red cross for random failures that are DB related or something new.
   
   We will need to main the CI DockerImage for any mariadb libraries that it is upto date too. There are no huge gains for the maintenance effort. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962265853


   Me too. My opinion is that trying to "soften" the message is much worse that being very straightforward. We should be more curagous sometimes to say "We do not do that" rahter thatn "Well, maybe that we wll do but not entirely". For me - merging that change and removing the tests but bluntly saying "we do not support MariaDB" is FAR better than saying "we do best effort to support MariaDB". We don't. It does not work. some people did some changes to make it work, Maybe it does for some people , but we do not - as a community - support it at all. 
   
   Just it - transparent, open communication and properly setting the expectations of people who are receiving the mesage. By saying 'best effort" we go into the "blurry" zone - where our expectation is that Maria DB does not work, but the user expectation who happens to know Maria DB is that it works. This is a wrong place to be in where expectations are so different on both sides.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-926780720


   (Most of the changes in the biggest - first change) were to fix various MSSQL incompatibilities, so you will likely not need those.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Mikhail-M commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
Mikhail-M commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-926785575


   Thank you! 
   I think it should be better to create another PR for production-level MariaDB support. What with current one?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] angelorobo commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
angelorobo commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-975283334


   Is the MariaDB support decision finalized yet?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961743999


   I disagree completely with "no test" as "no effort". 
   
   We made sure even if the experience was sub par for Scheduler HA, to atleast make sure it works.
   
   Adding to CI is a big NO - CI already fails in certain cases where the frustration would grow for users to see the red cross for random failures that are DB related or something new.
   
   We will need to main the CI DockerImage for any mariadb libraries that it is upto date too. There are no huge gains for the maintenance effort.
   
   We should really spend the time to building new features in Core Airflow.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961743999


   I disagree completely with "no test" as "no effort". 
   
   We made sure even if the experience was sub par for Scheduler HA, to atleast make sure it works.
   
   Adding to CI is a big NO - CI already fails in certain cases where the frustration would grow for users to see the red cross for random failures that are DB related or something new.
   
   We will need to main the CI DockerImage for any mariadb libraries that it is upto date too. There are no huge gains for the maintenance effort.
   
   We should really spend the time to building new features in Core Airflow.
   
   We know some of the issues with both MySQL and Mariadb and Cloud Composer shifting to Postgres is also a good example. If the underlying DB has some issues like "deadlocks" which won't be easily tested via tests as they are sporadic, we will have to spend time to investigate and fix those if we officially support it.
   
   We should really reduce the surface area over here for the things that we officially support.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962262489


   And let me be very clear on that as well - I am also OK on merging that PR without the tests. But I oppose on calling it "best efort". I stand by my statement that not running tests is "we do not suppot" MariaDb. and whenever someeone reports issue with MariaDB, our answer should be (as today ) - we do not support it, change to MySQL. Postgres or MsSQL. . The answer 'we do best effort' to have Maria DB is simply wrong statement that bears no meaning whatsoever.  From our perspectie either we are OK with our users to run MariaDB or not. 
   
   IMHO We should bluntly say "MariaDB  does not work" if we have no tests. Because we really do not know. We never tested it. We do not even know which versions are supported to which extent and which configurations are supported for encoding and the like. There is no "maybe".  
   
   This is all I am here about our statement. The "best effort" is just wrong statement IMHO.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962263291


   >This is all I am here about our statement. The "best effort" is just wrong statement IMHO.
   
   Check my very first comment:
   https://github.com/apache/airflow/pull/18506#issuecomment-961495961 -- I am against officially supporting it :) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962263892


   Yup happy with it


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Mikhail-M edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
Mikhail-M edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962229295


   > I disagree completely with "no test" as "no effort".
   > We made sure even if the experience was sub par for Scheduler HA, to atleast make sure it works.
   
   In my experience Scheduler HA with MariaDB backend doesn't work now(deadlocks), so I decided to prepare this patch (The details in the PR description). 
   Now we are using fork with this patch and it looks like it works fine (Airflow 2.1+, 2.2 + MariaDB Galera Cluster have been tested). 
   
   But overal, it works well, because, as @potiuk mentioned, MariaDB is almost 100% compatible with MySQL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962233429


   For me it's a bit of "burying head in sand" (not sure of the idiomatic English). 
   
   If we think MySQL is bad for us - let's take a "curagous' step and drop it (knowing the consequences).
    
   Otherwise MariaDB is just a flavour of MySQL that people use and and by not testing it we wil have more issues to handle on a daily basis:
   
   *  We do not need to add mariadb clients to image (mysql libs works for it 100%)
   *  People are already using MariaDB seeing "mysql" is supported
   
   This is very different than MsSQL or other DBs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962240412


   > To be very clear I am ok to have the "code" fix that you have in a separate PR but I am not OK with adding MariaDB to the CI and saying we officially support it
   
   We can also add MariaDB to the test suite but say that we don't "officially" support it yet. similarly as wtih all other changes I am hesitant to approve a change that we have no tests that test it. That sounds very inconsistent.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962249843


   >We can also add MariaDB to the test suite but say that we don't "officially" support it yet. similarly as wtih all other changes I am hesitant to approve a change that we have no tests that test it. That sounds very inconsistent.
   
   Approva a change with no tests? Since we support Postgres, MySQL and MSSQL and the tests are passing with them, what more do you need? I don't see it is as inconsistent, we have doing that for ages. That is the main difference on what is officially supported vs not


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-962249843


   >We can also add MariaDB to the test suite but say that we don't "officially" support it yet. similarly as wtih all other changes I am hesitant to approve a change that we have no tests that test it. That sounds very inconsistent.
   
   Approve a change with no tests? Since we support Postgres, MySQL and MSSQL and the tests are passing with them, what more do you need? I don't see it is as inconsistent, we have doing that for ages. That is the main difference on what is officially supported vs not


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-961162163


   > I'm not sure how I feel about adding another DB to the test matrix.
   > 
   > I think I'm leaning towards leaving MariaDB support as "best effort" but not actively tested.
   
   I'd take slightly different approach. I think what we could do actually, is to make another (long overdue) refactor of our test suites to only run part of the tests for all databases. For example almost all provider tests do not neeed to be run for all databases, similarly we could refactor/identify/mark some of the modules/packages in core that do not require multi-db testing as they won't need to access the db at all.  That will require a bit more automation (and for example some way of verification that potentially new DB-specific tests are added to some excluded packages).
   
   I want to wait with that however until we rewrite the Breeze / CI to Python (which we plan to do with the Outreachy interns) and then we can likely think about implementing more elaborated logic in the selective-tests part especially. Then Adding a new DB will be far less of a pain.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Mikhail-M edited a comment on pull request #18506: Support for MariaDB >= 10.6

Posted by GitBox <gi...@apache.org>.
Mikhail-M edited a comment on pull request #18506:
URL: https://github.com/apache/airflow/pull/18506#issuecomment-926785575


   Thank you! 
   I think it should be better to create another PR for production-level MariaDB support. What should we do with current one?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org