You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/06/29 15:40:04 UTC

[GitHub] [airflow] Hasan-J opened a new pull request #9067: Make conn_id unique in Connections table

Hasan-J opened a new pull request #9067:
URL: https://github.com/apache/airflow/pull/9067


   Closes #8608 
   
   Added an alembic revision, if the user already has multiple records with the same `conn_id`
   in the connections table, it throws an exception during `airflow db upgrade` with a clear
   message.
   
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-667656018


   Thanks @Hasan-J πŸŽ‰ 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-665664474


   @Hasan-J I accepted and for some reason I cannot merge this change. Additionally, I have a small request: when you finish working on your change, please write a comment so that we will receive notifications about your activity.  If you do not make a comment, we do not see the progress of the work and we must remember to come back to this change.
    


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Hasan-J commented on pull request #9067: [AIRFLOW-8608] Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
Hasan-J commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-636969504


   I have removed the test method `test_connection_metastore_secrets_backend` in `tests/secrets/test_secrets_backends.py` based on the fact that it's using mutiple connections with the same `conn_id`. Maybe someone can confirm that this indeed should be removed !


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] xinbinhuang commented on a change in pull request #9067: [AIRFLOW-8608] Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
xinbinhuang commented on a change in pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#discussion_r432726322



##########
File path: airflow/migrations/versions/8d48763f6d53_add_unique_constraint_to_conn_id.py
##########
@@ -0,0 +1,76 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""add unique constraint to conn_id
+
+Revision ID: 8d48763f6d53
+Revises: 952da73b5eff
+Create Date: 2020-05-03 16:55:01.834231
+
+"""
+
+import sqlalchemy as sa
+from alembic import op
+
+# revision identifiers, used by Alembic.
+revision = '8d48763f6d53'
+down_revision = '952da73b5eff'
+branch_labels = None
+depends_on = None
+
+
+def upgrade():
+    """Apply add unique constraint to conn_id"""
+    conn = op.get_bind()
+    try:
+        if conn.dialect.name in ["mysql", "postgresql", "mssql"]:
+            op.create_unique_constraint(
+                constraint_name="unique_conn_id",
+                table_name="connection",
+                columns=["conn_id"])
+
+        elif conn.dialect.name == "sqlite":
+            # use batch_alter_table to support SQLite workaround
+            with op.batch_alter_table('connection') as batch_op:
+                batch_op.create_unique_constraint(
+                    constraint_name="unique_conn_id",
+                    columns=["conn_id"]
+                )

Review comment:
       
   ```suggestion
             
               with op.batch_alter_table('connection') as batch_op:
                   batch_op.create_unique_constraint(
                       constraint_name="unique_conn_id",
                       columns=["conn_id"]
                   )
   ```
   
   You don't need to create your own branching logic. [Alembic](https://alembic.sqlalchemy.org/en/latest/ops.html#alembic.operations.Operations.batch_alter_table) will handle it for you internally. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk merged pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #9067:
URL: https://github.com/apache/airflow/pull/9067


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Hasan-J closed pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
Hasan-J closed pull request #9067:
URL: https://github.com/apache/airflow/pull/9067


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-667612422


   > Got it, is there something I can do to help with troubleshooting?
   
   @ashb  asked for changes to this ticket, but is now on vacation.  This blocks the merging of this change.  Could you close this PR and create a new one?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#discussion_r440092182



##########
File path: airflow/models/connection.py
##########
@@ -119,6 +120,8 @@ class Connection(Base, LoggingMixin):
     is_extra_encrypted = Column(Boolean, unique=False, default=False)
     _extra = Column('extra', String(5000))
 
+    UniqueConstraint(conn_id, name='unique_conn_id')

Review comment:
       This looks odd -- I would have expected it to be this
   ```suggestion
       __table_args__ = (
           UniqueConstraint(conn_id, name='unique_conn_id'),
       )
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Hasan-J edited a comment on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
Hasan-J edited a comment on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-667655525


   I think we're ready to merge this!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #9067: [AIRFLOW-8608] Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-640912099






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] feluelle commented on a change in pull request #9067: [AIRFLOW-8608] Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
feluelle commented on a change in pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#discussion_r433742008



##########
File path: tests/secrets/test_secrets_backends.py
##########
@@ -70,20 +69,6 @@ def test_connection_env_secrets_backend(self):
         # we could make this more precise by defining __eq__ method for Connection
         self.assertEqual(sample_conn_1.host.lower(), conn.host)
 
-    def test_connection_metastore_secrets_backend(self):
-        sample_conn_2a = SampleConn("sample_2", "A")
-        sample_conn_2b = SampleConn("sample_2", "B")
-        with create_session() as session:
-            session.add(sample_conn_2a.conn)
-            session.add(sample_conn_2b.conn)
-            session.commit()
-        metastore_backend = MetastoreBackend()
-        conn_list = metastore_backend.get_connections("sample_2")
-        host_list = {x.host for x in conn_list}
-        self.assertEqual(
-            {sample_conn_2a.host.lower(), sample_conn_2b.host.lower()}, set(host_list)
-        )
-

Review comment:
       I think we still need this test but maybe modify it to test one connection instead of two.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Hasan-J commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
Hasan-J commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-667655525


   I think we're ready to merge this !


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#discussion_r445427971



##########
File path: airflow/models/connection.py
##########
@@ -156,6 +156,62 @@ class Connection(Base, LoggingMixin):
     is_extra_encrypted = Column(Boolean, unique=False, default=False)
     _extra = Column('extra', String(5000))
 
+    _types = [
+        ('docker', 'Docker Registry'),
+        ('elasticsearch', 'Elasticsearch'),
+        ('exasol', 'Exasol'),
+        ('facebook_social', 'Facebook Social'),
+        ('fs', 'File (path)'),
+        ('ftp', 'FTP'),
+        ('google_cloud_platform', 'Google Cloud Platform'),
+        ('hdfs', 'HDFS'),
+        ('http', 'HTTP'),
+        ('pig_cli', 'Pig Client Wrapper'),
+        ('hive_cli', 'Hive Client Wrapper'),
+        ('hive_metastore', 'Hive Metastore Thrift'),
+        ('hiveserver2', 'Hive Server 2 Thrift'),
+        ('jdbc', 'JDBC Connection'),
+        ('odbc', 'ODBC Connection'),
+        ('jenkins', 'Jenkins'),
+        ('mysql', 'MySQL'),
+        ('postgres', 'Postgres'),
+        ('oracle', 'Oracle'),
+        ('vertica', 'Vertica'),
+        ('presto', 'Presto'),
+        ('s3', 'S3'),
+        ('samba', 'Samba'),
+        ('sqlite', 'Sqlite'),
+        ('ssh', 'SSH'),
+        ('cloudant', 'IBM Cloudant'),
+        ('mssql', 'Microsoft SQL Server'),
+        ('mesos_framework-id', 'Mesos Framework ID'),
+        ('jira', 'JIRA'),
+        ('redis', 'Redis'),
+        ('wasb', 'Azure Blob Storage'),
+        ('databricks', 'Databricks'),
+        ('aws', 'Amazon Web Services'),
+        ('emr', 'Elastic MapReduce'),
+        ('snowflake', 'Snowflake'),
+        ('segment', 'Segment'),
+        ('sqoop', 'Sqoop'),
+        ('azure_batch', 'Azure Batch Service'),
+        ('azure_data_lake', 'Azure Data Lake'),
+        ('azure_container_instances', 'Azure Container Instances'),
+        ('azure_cosmos', 'Azure CosmosDB'),
+        ('azure_data_explorer', 'Azure Data Explorer'),
+        ('cassandra', 'Cassandra'),
+        ('qubole', 'Qubole'),
+        ('mongo', 'MongoDB'),
+        ('gcpcloudsql', 'Google Cloud SQL'),
+        ('grpc', 'GRPC Connection'),
+        ('yandexcloud', 'Yandex Cloud'),
+        ('livy', 'Apache Livy'),
+        ('tableau', 'Tableau'),
+        ('kubernetes', 'Kubernetes cluster Connection'),
+        ('spark', 'Spark'),
+        ('imap', 'IMAP')
+    ]
+

Review comment:
       Are these used anywhere? It doesn't appear to be, if so please don't add them in this PR.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Hasan-J edited a comment on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
Hasan-J edited a comment on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-644938995


   > I think the airflow.secrets.local_filesystem needs updating too -- it still supports multiple connections with the same id.
   
   Oh, this is my first time reading about this backend. Should we stop and raise an exception in case of multiple connections or just load the last one specified ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-667655943


   Awesome work, congrats on your first merged pull request!
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Hasan-J commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
Hasan-J commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-667656181


   I really appreciate all of you. Thank you.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-668101338


   I thought about this change even longer and I think we need to make additional changes. The change to the database is not enough, but we also need to update the Python API.
   https://github.com/apache/airflow/issues/10135


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #9067: [AIRFLOW-8608] Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-636182487


   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, pylint and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/master/docs/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better πŸš€.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://apache-airflow-slack.herokuapp.com/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#discussion_r450138016



##########
File path: UPDATING.md
##########
@@ -61,6 +61,14 @@ More tips can be found in the guide:
 https://developers.google.com/style/inclusive-documentation
 
 -->
+### Unique conn_id

Review comment:
       ```suggestion
   ### Connections must now have a unique conn_id
   ```

##########
File path: UPDATING.md
##########
@@ -61,6 +61,14 @@ More tips can be found in the guide:
 https://developers.google.com/style/inclusive-documentation
 
 -->
+### Unique conn_id
+
+Previously, airflow allowed users to add more than one connection with the same ``conn_id`` and on access it would choose one connection randomly. This acted as a basic load balancing and fault tolerance technique, when used in conjunction with retries.

Review comment:
       ```suggestion
   Previously, Airflow allowed users to add more than one connection with the same `conn_id` and on access it would choose one connection randomly. This acted as a basic load balancing and fault tolerance technique, when used in conjunction with retries.
   ```

##########
File path: airflow/models/connection.py
##########
@@ -145,7 +145,7 @@ class Connection(Base, LoggingMixin):
     __tablename__ = "connection"
 
     id = Column(Integer(), primary_key=True)
-    conn_id = Column(String(ID_LEN))
+    conn_id = Column(String(ID_LEN), unique=True)

Review comment:
       ```suggestion
       conn_id = Column(String(ID_LEN), unique=True, nullable=False)
   ```
   
   

##########
File path: UPDATING.md
##########
@@ -61,6 +61,14 @@ More tips can be found in the guide:
 https://developers.google.com/style/inclusive-documentation
 
 -->
+### Unique conn_id
+
+Previously, airflow allowed users to add more than one connection with the same ``conn_id`` and on access it would choose one connection randomly. This acted as a basic load balancing and fault tolerance technique, when used in conjunction with retries.
+
+This behavior caused some confusion for users, and there was no clear evidence if it actually worked well or not.
+
+Now the ``conn_id`` will be unique. New users will not have to deal with any consequences. But for existing users that already have data in their metadata database, they will have to manage those duplicate connections before upgrading the database with ``airflow db upgrade``.

Review comment:
       ```suggestion
   Now the `conn_id` will be unique. New users will not have to deal with any consequences. But for existing users that already have data in their metadata database, they will have to manage those duplicate connections before upgrading the database with `airflow db upgrade`.
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj edited a comment on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-668185084


   @Hasan-J Can you comment on the ticket? Without it, I can't assign you to it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#discussion_r441517352



##########
File path: airflow/models/connection.py
##########
@@ -119,6 +120,8 @@ class Connection(Base, LoggingMixin):
     is_extra_encrypted = Column(Boolean, unique=False, default=False)
     _extra = Column('extra', String(5000))
 
+    UniqueConstraint(conn_id, name='unique_conn_id')

Review comment:
       Actually - even simpler is adding `unique=True` on Line 111: `conn_type = Column(String(500))`- ? `conn_type = Column(String(500), unique=True)`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-667655953


   Wohoo :tada: !


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj edited a comment on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-647117879


   @Hasan-J I think It would be a better idea to raise an exception.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-667614366


   I dismissed the review from @ashb now - we think all has been addressed here. So you only need to rebase it now @Hasan-J !


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#discussion_r445428859



##########
File path: airflow/models/connection.py
##########
@@ -156,6 +156,62 @@ class Connection(Base, LoggingMixin):
     is_extra_encrypted = Column(Boolean, unique=False, default=False)
     _extra = Column('extra', String(5000))
 
+    _types = [
+        ('docker', 'Docker Registry'),
+        ('elasticsearch', 'Elasticsearch'),
+        ('exasol', 'Exasol'),
+        ('facebook_social', 'Facebook Social'),
+        ('fs', 'File (path)'),
+        ('ftp', 'FTP'),
+        ('google_cloud_platform', 'Google Cloud Platform'),
+        ('hdfs', 'HDFS'),
+        ('http', 'HTTP'),
+        ('pig_cli', 'Pig Client Wrapper'),
+        ('hive_cli', 'Hive Client Wrapper'),
+        ('hive_metastore', 'Hive Metastore Thrift'),
+        ('hiveserver2', 'Hive Server 2 Thrift'),
+        ('jdbc', 'JDBC Connection'),
+        ('odbc', 'ODBC Connection'),
+        ('jenkins', 'Jenkins'),
+        ('mysql', 'MySQL'),
+        ('postgres', 'Postgres'),
+        ('oracle', 'Oracle'),
+        ('vertica', 'Vertica'),
+        ('presto', 'Presto'),
+        ('s3', 'S3'),
+        ('samba', 'Samba'),
+        ('sqlite', 'Sqlite'),
+        ('ssh', 'SSH'),
+        ('cloudant', 'IBM Cloudant'),
+        ('mssql', 'Microsoft SQL Server'),
+        ('mesos_framework-id', 'Mesos Framework ID'),
+        ('jira', 'JIRA'),
+        ('redis', 'Redis'),
+        ('wasb', 'Azure Blob Storage'),
+        ('databricks', 'Databricks'),
+        ('aws', 'Amazon Web Services'),
+        ('emr', 'Elastic MapReduce'),
+        ('snowflake', 'Snowflake'),
+        ('segment', 'Segment'),
+        ('sqoop', 'Sqoop'),
+        ('azure_batch', 'Azure Batch Service'),
+        ('azure_data_lake', 'Azure Data Lake'),
+        ('azure_container_instances', 'Azure Container Instances'),
+        ('azure_cosmos', 'Azure CosmosDB'),
+        ('azure_data_explorer', 'Azure Data Explorer'),
+        ('cassandra', 'Cassandra'),
+        ('qubole', 'Qubole'),
+        ('mongo', 'MongoDB'),
+        ('gcpcloudsql', 'Google Cloud SQL'),
+        ('grpc', 'GRPC Connection'),
+        ('yandexcloud', 'Yandex Cloud'),
+        ('livy', 'Apache Livy'),
+        ('tableau', 'Tableau'),
+        ('kubernetes', 'Kubernetes cluster Connection'),
+        ('spark', 'Spark'),
+        ('imap', 'IMAP')
+    ]
+

Review comment:
       Other than this change this PR looks good to me.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-662178887


   I can see CI/CD is sad. Can you fix it?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] feluelle commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
feluelle commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-668184914


   @Hasan-J please add a comment to the issue so we can assign it to you. :)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj edited a comment on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-668101338


   I thought about this change even longer and I think we need to make additional changes. The change to the database is not enough, but we also need to update the Python API. Would you like to deal with it?
   https://github.com/apache/airflow/issues/10135


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-647117879


   @Hasan-J It would be a better idea to raise an exception.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Hasan-J commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
Hasan-J commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-665667581


   Got it, is there something I can do to help with troubleshooting?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-668185084


   @Hasan-J Can you comment on the ticket? Without it, I can't assign you to him.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#discussion_r464053787



##########
File path: UPDATING.md
##########
@@ -322,6 +322,14 @@ We should default it as `true` to avoid confusion
 In order to migrate the database, you should use the command `airflow db upgrade`, but in
 some cases manual steps are required.
 
+#### Unique conn_id in connection table
+
+Previously, Airflow allowed users to add more than one connection with the same `conn_id` and on access it would choose one connection randomly. This acted as a basic load balancing and fault tolerance technique, when used in conjunction with retries.
+
+This behavior caused some confusion for users, and there was no clear evidence if it actually worked well or not.
+
+Now the `conn_id` will be unique. If you already have duplicates your metadata database, you will have to manage those duplicate connections before upgrading the database.

Review comment:
       ```suggestion
   Now the `conn_id` will be unique. If you already have duplicates in your metadata database, you will have to manage those duplicate connections before upgrading the database.
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Hasan-J edited a comment on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
Hasan-J edited a comment on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-667655525


   @mik-laj I think we're ready to merge this!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Hasan-J commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
Hasan-J commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-644938995


   > I think the airflow.secrets.local_filesystem needs updating too -- it still supports multiple connections with the same id.
   
   Oh, this is my first time reading about this backend. Should we stop and raise an exception in case of multiple connections or just load that last one specified ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Hasan-J commented on pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
Hasan-J commented on pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#issuecomment-668172072


   I've also had my thoughts about this change, but didn't mind it because it "worked".
   I will take a look and open a PR soon :)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9067: [AIRFLOW-8608] Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#discussion_r440092182



##########
File path: airflow/models/connection.py
##########
@@ -119,6 +120,8 @@ class Connection(Base, LoggingMixin):
     is_extra_encrypted = Column(Boolean, unique=False, default=False)
     _extra = Column('extra', String(5000))
 
+    UniqueConstraint(conn_id, name='unique_conn_id')

Review comment:
       This looks odd -- I would have expected it to be this
   ```suggestion
       __table_args = (
           UniqueConstraint(conn_id, name='unique_conn_id'),
       )
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #9067: Make conn_id unique in Connections table

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #9067:
URL: https://github.com/apache/airflow/pull/9067#discussion_r464053997



##########
File path: airflow/migrations/versions/8d48763f6d53_add_unique_constraint_to_conn_id.py
##########
@@ -0,0 +1,69 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""add unique constraint to conn_id
+
+Revision ID: 8d48763f6d53
+Revises: 8f966b9c467a
+Create Date: 2020-05-03 16:55:01.834231
+
+"""
+
+import sqlalchemy as sa
+from alembic import op
+
+from airflow.models.base import ID_LEN

Review comment:
       This value can change in future.
   
   So we should hard code it in the migration itself, instead of importing




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org