You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/08/08 20:31:35 UTC

[GitHub] [airflow] khyurri opened a new pull request #10254: Warning about unsafe migrations

khyurri opened a new pull request #10254:
URL: https://github.com/apache/airflow/pull/10254


   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   This PR resolves issue https://github.com/apache/airflow/issues/10129
   
   Output examples:
   
   ```
   [2020-08-08 23:20:29,394] {db.py:681} ERROR - Automatic migration is not available
   Seems you have non unique conn_id in connection table.
   You have to manage those duplicate connections before upgrading the database.
   Duplicated conn_id: [airflow_db]
   [2020-08-08 23:20:29,394] {db.py:681} ERROR - Automatic migration is not available
   The conn_type column in the connection table must contain content.
   Make sure you don't have null in the conn_type column.
   Null conn_type conn_id: [yandexcloud_default]
   ```
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#discussion_r537740745



##########
File path: airflow/utils/db.py
##########
@@ -597,6 +597,68 @@ def check_migrations(timeout):
             log.info('Waiting for migrations... %s second(s)', ticker)
 
 
+def check_conn_id_duplicates(session=None):

Review comment:
       ```suggestion
   def check_conn_id_duplicates(session=None) -> str:
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#issuecomment-740206647


   [The Workflow run](https://github.com/apache/airflow/actions/runs/406821583) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#issuecomment-699821690


   @turbaszek @olchas @TobKed Can you look at it?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#discussion_r537744013



##########
File path: airflow/utils/db.py
##########
@@ -597,6 +597,68 @@ def check_migrations(timeout):
             log.info('Waiting for migrations... %s second(s)', ticker)
 
 
+def check_conn_id_duplicates(session=None):
+    """
+    Check unique conn_id in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    dups = []
+    try:
+        dups = session.query(Connection, func.count(Connection.conn_id)) \
+                      .group_by(Connection.conn_id) \
+                      .having(func.count(Connection.conn_id) > 1).all()
+    except (exc.OperationalError, exc.ProgrammingError):
+        # fallback if tables hasn't been created yet
+        pass
+    if dups:
+        return f'Seems you have non unique conn_id in connection table.\n' \
+               f'You have to manage those duplicate connections ' \
+               f'before upgrading the database.\n' \
+               f'Duplicated conn_id: {[dup[0] for dup in dups]}'
+
+    return ''
+
+
+def check_conn_type_null(session=None):

Review comment:
       ```suggestion
   def check_conn_type_null(session=None) -> str:
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#issuecomment-714205645


   [The Workflow run](https://github.com/apache/airflow/actions/runs/321244722) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] khyurri commented on a change in pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
khyurri commented on a change in pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#discussion_r499487608



##########
File path: airflow/utils/db.py
##########
@@ -606,6 +606,59 @@ def check_migrations(timeout):
             log.info('Waiting for migrations... %s second(s)', ticker)
 
 
+def check_conn_id_duplicates(session=None):
+    """
+    Check unique conn_id in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    dups = session.query(Connection, func.count(Connection.conn_id)) \
+                  .group_by(Connection.conn_id) \
+                  .having(func.count(Connection.conn_id) > 1).all()
+    if dups:
+        return f'Seems you have non unique conn_id in connection table.\n' \
+               f'You have to manage those duplicate connections ' \
+               f'before upgrading the database.\n' \
+               f'Duplicated conn_id: {[dup[0] for dup in dups]}'
+    return ''
+
+
+def check_conn_type_null(session=None):
+    """
+    Check nullable conn_type column in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    n_nulls = session.query(Connection)\
+                     .filter(Connection.conn_type.is_(None)).all()
+    if n_nulls:
+        return f'The conn_type column in the connection ' \
+               f'table must contain content.\n' \
+               f'Make sure you don\'t have null ' \
+               f'in the conn_type column.\n' \
+               f'Null conn_type conn_id: {list(n_nulls)}'
+    return ''
+
+
+@provide_session
+def auto_migrations_available(session=None):
+    """
+    @session: session of the sqlalchemy
+    @return: list[str]
+    """
+    errors_ = []
+    try:
+        for check_fn in (check_conn_id_duplicates, check_conn_type_null):
+            err = check_fn(session)
+            if err:
+                errors_.append(err)
+    except (exc.OperationalError, exc.ProgrammingError):
+        # fallback if tables hasn't been created yet
+        pass

Review comment:
       Fixed. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] olchas commented on a change in pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
olchas commented on a change in pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#discussion_r496513412



##########
File path: airflow/utils/db.py
##########
@@ -606,6 +606,59 @@ def check_migrations(timeout):
             log.info('Waiting for migrations... %s second(s)', ticker)
 
 
+def check_conn_id_duplicates(session=None):
+    """
+    Check unique conn_id in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    dups = session.query(Connection, func.count(Connection.conn_id)) \
+                  .group_by(Connection.conn_id) \
+                  .having(func.count(Connection.conn_id) > 1).all()
+    if dups:
+        return f'Seems you have non unique conn_id in connection table.\n' \
+               f'You have to manage those duplicate connections ' \
+               f'before upgrading the database.\n' \
+               f'Duplicated conn_id: {[dup[0] for dup in dups]}'
+    return ''
+
+
+def check_conn_type_null(session=None):
+    """
+    Check nullable conn_type column in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    n_nulls = session.query(Connection)\
+                     .filter(Connection.conn_type.is_(None)).all()
+    if n_nulls:
+        return f'The conn_type column in the connection ' \
+               f'table must contain content.\n' \
+               f'Make sure you don\'t have null ' \
+               f'in the conn_type column.\n' \
+               f'Null conn_type conn_id: {list(n_nulls)}'
+    return ''
+
+
+@provide_session
+def auto_migrations_available(session=None):
+    """
+    @session: session of the sqlalchemy
+    @return: list[str]
+    """
+    errors_ = []
+    try:
+        for check_fn in (check_conn_id_duplicates, check_conn_type_null):
+            err = check_fn(session)
+            if err:
+                errors_.append(err)
+    except (exc.OperationalError, exc.ProgrammingError):
+        # fallback if tables hasn't been created yet
+        pass

Review comment:
       Maybe you could move catching exceptions directly to `check_conn_id_duplicates` and `check_conn_type_null` and return an empty string, if exception is caught? But I am not sure myself if it is even possible just for `Connection` table to not exist. I would expect that if `Connection` table does not exits, other tables won't exist either.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#issuecomment-682755517


   CI is sad. Can you fix it?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#discussion_r537744155



##########
File path: airflow/utils/db.py
##########
@@ -597,6 +597,68 @@ def check_migrations(timeout):
             log.info('Waiting for migrations... %s second(s)', ticker)
 
 
+def check_conn_id_duplicates(session=None):
+    """
+    Check unique conn_id in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    dups = []
+    try:
+        dups = session.query(Connection, func.count(Connection.conn_id)) \
+                      .group_by(Connection.conn_id) \
+                      .having(func.count(Connection.conn_id) > 1).all()
+    except (exc.OperationalError, exc.ProgrammingError):
+        # fallback if tables hasn't been created yet
+        pass
+    if dups:
+        return f'Seems you have non unique conn_id in connection table.\n' \
+               f'You have to manage those duplicate connections ' \
+               f'before upgrading the database.\n' \
+               f'Duplicated conn_id: {[dup[0] for dup in dups]}'
+
+    return ''
+
+
+def check_conn_type_null(session=None):
+    """
+    Check nullable conn_type column in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    n_nulls = []
+    try:
+        n_nulls = session.query(Connection)\
+                         .filter(Connection.conn_type.is_(None)).all()
+    except (exc.OperationalError, exc.ProgrammingError, exc.InternalError):
+        # fallback if tables hasn't been created yet
+        pass
+
+    if n_nulls:
+        return f'The conn_type column in the connection ' \
+               f'table must contain content.\n' \
+               f'Make sure you don\'t have null ' \
+               f'in the conn_type column.\n' \
+               f'Null conn_type conn_id: {list(n_nulls)}'

Review comment:
       ```suggestion
           return 'The conn_type column in the connection ' \
                  'table must contain content.\n' \
                  'Make sure you don\'t have null ' \
                  'in the conn_type column.\n' \
                  f'Null conn_type conn_id: {list(n_nulls)}'
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#issuecomment-740207043


   [The Workflow run](https://github.com/apache/airflow/actions/runs/406824411) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] khyurri commented on pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
khyurri commented on pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#issuecomment-699658199


   @mik-laj 
   Not sure why `CI Build / Core:MySQL5.7, Py3.6 (pull_request)` is failed. 
   Other CI seems good. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#discussion_r537742052



##########
File path: airflow/utils/db.py
##########
@@ -597,6 +597,68 @@ def check_migrations(timeout):
             log.info('Waiting for migrations... %s second(s)', ticker)
 
 
+def check_conn_id_duplicates(session=None):
+    """
+    Check unique conn_id in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    dups = []
+    try:
+        dups = session.query(Connection, func.count(Connection.conn_id)) \
+                      .group_by(Connection.conn_id) \
+                      .having(func.count(Connection.conn_id) > 1).all()
+    except (exc.OperationalError, exc.ProgrammingError):
+        # fallback if tables hasn't been created yet
+        pass
+    if dups:
+        return f'Seems you have non unique conn_id in connection table.\n' \
+               f'You have to manage those duplicate connections ' \
+               f'before upgrading the database.\n' \
+               f'Duplicated conn_id: {[dup[0] for dup in dups]}'
+
+    return ''
+
+
+def check_conn_type_null(session=None):
+    """
+    Check nullable conn_type column in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    n_nulls = []
+    try:
+        n_nulls = session.query(Connection)\
+                         .filter(Connection.conn_type.is_(None)).all()
+    except (exc.OperationalError, exc.ProgrammingError, exc.InternalError):
+        # fallback if tables hasn't been created yet
+        pass
+
+    if n_nulls:
+        return f'The conn_type column in the connection ' \
+               f'table must contain content.\n' \
+               f'Make sure you don\'t have null ' \
+               f'in the conn_type column.\n' \
+               f'Null conn_type conn_id: {list(n_nulls)}'
+    return ''
+
+
+@provide_session
+def auto_migrations_available(session=None):
+    """
+    @session: session of the sqlalchemy
+    @return: list[str]

Review comment:
       ```suggestion
       :session: session of the sqlalchemy
       :rtype: list[str]
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#issuecomment-740208035


   [The Workflow run](https://github.com/apache/airflow/actions/runs/406829787) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] olchas commented on a change in pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
olchas commented on a change in pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#discussion_r499405653



##########
File path: airflow/utils/db.py
##########
@@ -606,6 +606,59 @@ def check_migrations(timeout):
             log.info('Waiting for migrations... %s second(s)', ticker)
 
 
+def check_conn_id_duplicates(session=None):
+    """
+    Check unique conn_id in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    dups = session.query(Connection, func.count(Connection.conn_id)) \
+                  .group_by(Connection.conn_id) \
+                  .having(func.count(Connection.conn_id) > 1).all()
+    if dups:
+        return f'Seems you have non unique conn_id in connection table.\n' \
+               f'You have to manage those duplicate connections ' \
+               f'before upgrading the database.\n' \
+               f'Duplicated conn_id: {[dup[0] for dup in dups]}'
+    return ''
+
+
+def check_conn_type_null(session=None):
+    """
+    Check nullable conn_type column in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    n_nulls = session.query(Connection)\
+                     .filter(Connection.conn_type.is_(None)).all()
+    if n_nulls:
+        return f'The conn_type column in the connection ' \
+               f'table must contain content.\n' \
+               f'Make sure you don\'t have null ' \
+               f'in the conn_type column.\n' \
+               f'Null conn_type conn_id: {list(n_nulls)}'
+    return ''
+
+
+@provide_session
+def auto_migrations_available(session=None):
+    """
+    @session: session of the sqlalchemy
+    @return: list[str]
+    """
+    errors_ = []
+    try:
+        for check_fn in (check_conn_id_duplicates, check_conn_type_null):
+            err = check_fn(session)
+            if err:
+                errors_.append(err)
+    except (exc.OperationalError, exc.ProgrammingError):
+        # fallback if tables hasn't been created yet
+        pass

Review comment:
       Yes, this is what I meant in the first place. I believe this should be a little bit safer :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] khyurri commented on pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
khyurri commented on pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#issuecomment-715048665


   @mik-laj 
   `airflow/providers/google/cloud/hooks/cloud_memorystore.py:546 in public method `get_conn`:
           D200: One-line docstring should fit on one line with quotes (found 3)`
   
   Is the latest master green? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#discussion_r537743661



##########
File path: airflow/utils/db.py
##########
@@ -597,6 +597,68 @@ def check_migrations(timeout):
             log.info('Waiting for migrations... %s second(s)', ticker)
 
 
+def check_conn_id_duplicates(session=None):
+    """
+    Check unique conn_id in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    dups = []
+    try:
+        dups = session.query(Connection, func.count(Connection.conn_id)) \
+                      .group_by(Connection.conn_id) \
+                      .having(func.count(Connection.conn_id) > 1).all()
+    except (exc.OperationalError, exc.ProgrammingError):
+        # fallback if tables hasn't been created yet
+        pass
+    if dups:
+        return f'Seems you have non unique conn_id in connection table.\n' \
+               f'You have to manage those duplicate connections ' \
+               f'before upgrading the database.\n' \
+               f'Duplicated conn_id: {[dup[0] for dup in dups]}'
+
+    return ''
+
+
+def check_conn_type_null(session=None):
+    """
+    Check nullable conn_type column in connection table
+    @param session:  session of the sqlalchemy
+    @return: str

Review comment:
       ```suggestion
       :param session:  session of the sqlalchemy
       :rtype: str
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] khyurri commented on pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
khyurri commented on pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#issuecomment-717934053


   @mik-laj 
   All checks are green, could this PR be merged? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] olchas commented on a change in pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
olchas commented on a change in pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#discussion_r496089937



##########
File path: airflow/utils/db.py
##########
@@ -606,6 +606,59 @@ def check_migrations(timeout):
             log.info('Waiting for migrations... %s second(s)', ticker)
 
 
+def check_conn_id_duplicates(session=None):
+    """
+    Check unique conn_id in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    dups = session.query(Connection, func.count(Connection.conn_id)) \
+                  .group_by(Connection.conn_id) \
+                  .having(func.count(Connection.conn_id) > 1).all()
+    if dups:
+        return f'Seems you have non unique conn_id in connection table.\n' \
+               f'You have to manage those duplicate connections ' \
+               f'before upgrading the database.\n' \
+               f'Duplicated conn_id: {[dup[0] for dup in dups]}'
+    return ''
+
+
+def check_conn_type_null(session=None):
+    """
+    Check nullable conn_type column in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    n_nulls = session.query(Connection)\
+                     .filter(Connection.conn_type.is_(None)).all()
+    if n_nulls:
+        return f'The conn_type column in the connection ' \
+               f'table must contain content.\n' \
+               f'Make sure you don\'t have null ' \
+               f'in the conn_type column.\n' \
+               f'Null conn_type conn_id: {list(n_nulls)}'
+    return ''
+
+
+@provide_session
+def auto_migrations_available(session=None):
+    """
+    @session: session of the sqlalchemy
+    @return: list[str]
+    """
+    errors_ = []
+    try:
+        for check_fn in (check_conn_id_duplicates, check_conn_type_null):
+            err = check_fn(session)
+            if err:
+                errors_.append(err)
+    except (exc.OperationalError, exc.ProgrammingError):
+        # fallback if tables hasn't been created yet
+        pass

Review comment:
       What do you think about catching exception for every `check_fn` separately? Right now we exit the loop on the first exception encountered and won't execute the remaining functions from the list. Of course, at the moment both `check_conn_id_duplicates` and `check_conn_type_null` will raise the exception in the same scenario (`Connection` table not existing), but if more check functions are added, then it might be important to execute them all.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#issuecomment-715242052


   @khyurri the newest master is green but we were having temporary problems. I did rebase again.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#discussion_r537743375



##########
File path: airflow/utils/db.py
##########
@@ -597,6 +597,68 @@ def check_migrations(timeout):
             log.info('Waiting for migrations... %s second(s)', ticker)
 
 
+def check_conn_id_duplicates(session=None):
+    """
+    Check unique conn_id in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    dups = []
+    try:
+        dups = session.query(Connection, func.count(Connection.conn_id)) \
+                      .group_by(Connection.conn_id) \
+                      .having(func.count(Connection.conn_id) > 1).all()
+    except (exc.OperationalError, exc.ProgrammingError):
+        # fallback if tables hasn't been created yet
+        pass
+    if dups:
+        return f'Seems you have non unique conn_id in connection table.\n' \
+               f'You have to manage those duplicate connections ' \
+               f'before upgrading the database.\n' \
+               f'Duplicated conn_id: {[dup[0] for dup in dups]}'

Review comment:
       ```suggestion
           return 'Seems you have non unique conn_id in connection table.\n' \
                  'You have to manage those duplicate connections ' \
                  'before upgrading the database.\n' \
                  f'Duplicated conn_id: {[dup[0] for dup in dups]}'
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] khyurri commented on a change in pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
khyurri commented on a change in pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#discussion_r496130366



##########
File path: airflow/utils/db.py
##########
@@ -606,6 +606,59 @@ def check_migrations(timeout):
             log.info('Waiting for migrations... %s second(s)', ticker)
 
 
+def check_conn_id_duplicates(session=None):
+    """
+    Check unique conn_id in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    dups = session.query(Connection, func.count(Connection.conn_id)) \
+                  .group_by(Connection.conn_id) \
+                  .having(func.count(Connection.conn_id) > 1).all()
+    if dups:
+        return f'Seems you have non unique conn_id in connection table.\n' \
+               f'You have to manage those duplicate connections ' \
+               f'before upgrading the database.\n' \
+               f'Duplicated conn_id: {[dup[0] for dup in dups]}'
+    return ''
+
+
+def check_conn_type_null(session=None):
+    """
+    Check nullable conn_type column in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    n_nulls = session.query(Connection)\
+                     .filter(Connection.conn_type.is_(None)).all()
+    if n_nulls:
+        return f'The conn_type column in the connection ' \
+               f'table must contain content.\n' \
+               f'Make sure you don\'t have null ' \
+               f'in the conn_type column.\n' \
+               f'Null conn_type conn_id: {list(n_nulls)}'
+    return ''
+
+
+@provide_session
+def auto_migrations_available(session=None):
+    """
+    @session: session of the sqlalchemy
+    @return: list[str]
+    """
+    errors_ = []
+    try:
+        for check_fn in (check_conn_id_duplicates, check_conn_type_null):
+            err = check_fn(session)
+            if err:
+                errors_.append(err)
+    except (exc.OperationalError, exc.ProgrammingError):
+        # fallback if tables hasn't been created yet
+        pass

Review comment:
       > Of course, at the moment both check_conn_id_duplicates and check_conn_type_null will raise the exception in the same scenario (Connection table not existing), but if more check functions are added, then it might be important to execute them all.
   
   Is it not better to rewrite this part, if `except` will catch excessively? Right now loop cancels, if database is not ready to upgrade. 
   
   Maybe I can catch more directly, only when table `Connection` is not exists? But I have not found proper way to achieve that without creating custom Exception class. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#issuecomment-703516002


   The CI and PROD Docker Images for the build are prepared in a separate "Build Image" workflow,
   that you will not see in the list of checks (you will see "Wait for images" jobs instead).
   
   You can checks the status of those images in [The workflow run](https://github.com/apache/airflow/actions/runs/289113853)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#issuecomment-740207131


   [The Workflow run](https://github.com/apache/airflow/actions/runs/406828315) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil merged pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
kaxil merged pull request #10254:
URL: https://github.com/apache/airflow/pull/10254


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#issuecomment-740109703


   Added suggestions, applied, rebased to Master


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#discussion_r537743855



##########
File path: airflow/utils/db.py
##########
@@ -597,6 +597,68 @@ def check_migrations(timeout):
             log.info('Waiting for migrations... %s second(s)', ticker)
 
 
+def check_conn_id_duplicates(session=None):
+    """
+    Check unique conn_id in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    dups = []
+    try:
+        dups = session.query(Connection, func.count(Connection.conn_id)) \
+                      .group_by(Connection.conn_id) \
+                      .having(func.count(Connection.conn_id) > 1).all()
+    except (exc.OperationalError, exc.ProgrammingError):
+        # fallback if tables hasn't been created yet
+        pass
+    if dups:
+        return f'Seems you have non unique conn_id in connection table.\n' \
+               f'You have to manage those duplicate connections ' \
+               f'before upgrading the database.\n' \
+               f'Duplicated conn_id: {[dup[0] for dup in dups]}'
+
+    return ''
+
+
+def check_conn_type_null(session=None):
+    """
+    Check nullable conn_type column in connection table

Review comment:
       ```suggestion
       Check nullable conn_type column in Connection table
   
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#issuecomment-714199119


   I rebased this chaange to latest master. When it's green, I'll merge the change.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] khyurri commented on a change in pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
khyurri commented on a change in pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#discussion_r499124977



##########
File path: airflow/utils/db.py
##########
@@ -606,6 +606,59 @@ def check_migrations(timeout):
             log.info('Waiting for migrations... %s second(s)', ticker)
 
 
+def check_conn_id_duplicates(session=None):
+    """
+    Check unique conn_id in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    dups = session.query(Connection, func.count(Connection.conn_id)) \
+                  .group_by(Connection.conn_id) \
+                  .having(func.count(Connection.conn_id) > 1).all()
+    if dups:
+        return f'Seems you have non unique conn_id in connection table.\n' \
+               f'You have to manage those duplicate connections ' \
+               f'before upgrading the database.\n' \
+               f'Duplicated conn_id: {[dup[0] for dup in dups]}'
+    return ''
+
+
+def check_conn_type_null(session=None):
+    """
+    Check nullable conn_type column in connection table
+    @param session:  session of the sqlalchemy
+    @return: str
+    """
+    n_nulls = session.query(Connection)\
+                     .filter(Connection.conn_type.is_(None)).all()
+    if n_nulls:
+        return f'The conn_type column in the connection ' \
+               f'table must contain content.\n' \
+               f'Make sure you don\'t have null ' \
+               f'in the conn_type column.\n' \
+               f'Null conn_type conn_id: {list(n_nulls)}'
+    return ''
+
+
+@provide_session
+def auto_migrations_available(session=None):
+    """
+    @session: session of the sqlalchemy
+    @return: list[str]
+    """
+    errors_ = []
+    try:
+        for check_fn in (check_conn_id_duplicates, check_conn_type_null):
+            err = check_fn(session)
+            if err:
+                errors_.append(err)
+    except (exc.OperationalError, exc.ProgrammingError):
+        # fallback if tables hasn't been created yet
+        pass

Review comment:
       >  I would expect that if Connection table does not exits, other tables won't exist either.
   
   Agree. 
   If `exc.OperationalError, exc.ProgrammingError`, it means, that database in unexpected state (eg empty). No reason to continue check migration safety. 
   
   Other way: 
   The script can except `exc.OperationalError, exc.ProgrammingError` on every item in loop:
   
   ```python
   for check_fn in (check_conn_id_duplicates, check_conn_type_null):
       try:
           err = check_fn(session)
           if err:
               errors_.append(err)
       except (exc.OperationalError, exc.ProgrammingError):
            # fallback if table hasn't been created yet
           pass
   ```
   
   What do you think about it? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #10254: Warning about unsafe migrations

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #10254:
URL: https://github.com/apache/airflow/pull/10254#discussion_r537740577



##########
File path: airflow/utils/db.py
##########
@@ -597,6 +597,68 @@ def check_migrations(timeout):
             log.info('Waiting for migrations... %s second(s)', ticker)
 
 
+def check_conn_id_duplicates(session=None):
+    """
+    Check unique conn_id in connection table
+    @param session:  session of the sqlalchemy
+    @return: str

Review comment:
       ```suggestion
       :param session:  session of the sqlalchemy
       :rtype: str
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org