You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "jaegwonseo (via GitHub)" <gi...@apache.org> on 2023/08/27 14:49:53 UTC

[GitHub] [airflow] jaegwonseo opened a new pull request, #33796: Add druid ingestion connection test

jaegwonseo opened a new pull request, #33796:
URL: https://github.com/apache/airflow/pull/33796

   expose druid connection and implement test
   
   jira issue link : https://github.com/apache/airflow/issues/33795
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] hussein-awala commented on a diff in pull request #33796: Add druid ingestion connection test

Posted by "hussein-awala (via GitHub)" <gi...@apache.org>.
hussein-awala commented on code in PR #33796:
URL: https://github.com/apache/airflow/pull/33796#discussion_r1314009187


##########
airflow/providers/apache/druid/hooks/druid.py:
##########
@@ -149,6 +154,20 @@ def submit_indexing_job(
 
         self.log.info("Successful index")
 
+    def test_connection(self) -> tuple[bool, str]:
+        try:
+            conn = self.get_connection(self.druid_ingest_conn_id)
+            host = conn.host
+            port = conn.port
+            # ref : https://druid.apache.org/docs/latest/operations/api-reference/#tasks
+            response = requests.get(f"http://{host}:{port}/druid/indexer/v1/tasks")

Review Comment:
   One of the main reasons for connection and test_connection method is storing the credentials securely and test them via this method, so IMHO this method should test if the credentials are valid or not. You can try something like:
   ```suggestion
               response = requests.get(
                   f"http://{host}:{port}/druid/indexer/v1/tasks",
                    auth=self.get_auth(),
                )
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #33796: Add druid ingestion connection test

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on PR #33796:
URL: https://github.com/apache/airflow/pull/33796#issuecomment-1694739723

   I think better approach woudl be to keep one of the connections (DBApi ?) as "druid". It will at least make existing DBApi connections work.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on a diff in pull request #33796: Add druid ingestion connection test

Posted by "eladkal (via GitHub)" <gi...@apache.org>.
eladkal commented on code in PR #33796:
URL: https://github.com/apache/airflow/pull/33796#discussion_r1312710041


##########
airflow/providers/apache/druid/CHANGELOG.rst:
##########
@@ -27,6 +27,19 @@
 Changelog
 ---------
 
+3.6.0
+.....
+
+Features
+~~~~~~~~
+

Review Comment:
   Yes.
   In changelog just place the info at the top. I will pick it and arrange it during release process.
   
   I think we can agree this is a breaking change so
   @jaegwonseo can you please add  4.0.0 entry in provider yaml?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jaegwonseo commented on a diff in pull request #33796: Add druid ingestion connection test

Posted by "jaegwonseo (via GitHub)" <gi...@apache.org>.
jaegwonseo commented on code in PR #33796:
URL: https://github.com/apache/airflow/pull/33796#discussion_r1313832765


##########
airflow/providers/apache/druid/CHANGELOG.rst:
##########
@@ -27,6 +27,19 @@
 Changelog
 ---------
 
+4.0.0
+.....
+
+Features
+~~~~~~~~
+
+* ``Add druid ingestion connection test And expose druid ingest hook (#33795)``
+
+.. note::
+  The connection type of Druid has been separated into two(druid_broker, druid_ingest)

Review Comment:
   @eladkal 
   I'm not an English speaker, so I'm not sure if the following mention is accurate. Please review this
   
   ```
   The connection type of Druid has been separated into two(Druid Broker, Druid Ingest). Please perform one of the two methods listed below.
     1. update druid connect type to (Druid Broker, Druid Ingest) on UI screen (Admin/Connections)
     2. run command airflow db migrate
   ```
   
   **but in second case, airflow db migrate doesn't support update connect type** so we need implement this. 
   my suggestion is write first option only, and add an implementation to check the version of the Druid provider in the db migrate process. how about do this in other issue
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #33796: Add druid ingestion connection test

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on PR #33796:
URL: https://github.com/apache/airflow/pull/33796#issuecomment-1694739966

   Also - this one woll need a changelog entry describing what changed and how users should modify their connections


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #33796: Add druid ingestion connection test

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on code in PR #33796:
URL: https://github.com/apache/airflow/pull/33796#discussion_r1307887349


##########
airflow/providers/apache/druid/CHANGELOG.rst:
##########
@@ -27,6 +27,19 @@
 Changelog
 ---------
 
+3.6.0
+.....
+
+Features
+~~~~~~~~
+

Review Comment:
   We should remove this one - it's up to release manager to decide a version. I think it shoud be 4.0.0 as the connection change is breaking. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jaegwonseo commented on a diff in pull request #33796: Add druid ingestion connection test

Posted by "jaegwonseo (via GitHub)" <gi...@apache.org>.
jaegwonseo commented on code in PR #33796:
URL: https://github.com/apache/airflow/pull/33796#discussion_r1314256500


##########
airflow/providers/apache/druid/hooks/druid.py:
##########
@@ -149,6 +154,20 @@ def submit_indexing_job(
 
         self.log.info("Successful index")
 
+    def test_connection(self) -> tuple[bool, str]:
+        try:
+            conn = self.get_connection(self.druid_ingest_conn_id)
+            host = conn.host
+            port = conn.port
+            # ref : https://druid.apache.org/docs/latest/operations/api-reference/#tasks
+            response = requests.get(f"http://{host}:{port}/druid/indexer/v1/tasks")

Review Comment:
   @hussein-awala 
   The Druid Broker Hook overrides the DbApiHook, but the test connection is not implemented to use authentication information
   https://github.com/apache/airflow/blob/main/airflow/providers/common/sql/hooks/sql.py#L550
   
   So, I think adding authentication information to the Druid hook (broker, ingest) is handled in another issue
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] Add druid ingestion connection test [airflow]

Posted by "eladkal (via GitHub)" <gi...@apache.org>.
eladkal commented on code in PR #33796:
URL: https://github.com/apache/airflow/pull/33796#discussion_r1367842425


##########
airflow/providers/apache/druid/CHANGELOG.rst:
##########
@@ -27,6 +27,19 @@
 Changelog
 ---------
 
+4.0.0
+.....
+
+Features
+~~~~~~~~
+
+* ``Add druid ingestion connection test And expose druid ingest hook (#33795)``
+
+.. note::
+  The connection type of Druid has been separated into two(druid_broker, druid_ingest)

Review Comment:
   The goal of the note is to explain what actions needed to be done in order to migrate.
   The note you listed says the conn is split in two so:
   1. How do I choose which one?
   2. If I want to have my code work exactly as it is before what steps I need to do?
   
   These are the two questions the notes must answer



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #33796: Add druid ingestion connection test

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on PR #33796:
URL: https://github.com/apache/airflow/pull/33796#issuecomment-1696331877

   > In order to execute queries within a distributed system, Druid utilizes broker nodes, and for data ingestion, it leverages ingest nodes. Since each node has distinct roles and invocation addresses, maintaining the current setup with these two separate nodes would be preferable.
   
   I see. Ok then it should be a breaking change - but we let release manager decide on that - entry in changeloge should not contain the version (see suggestion).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #33796: Add druid ingestion connection test

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on code in PR #33796:
URL: https://github.com/apache/airflow/pull/33796#discussion_r1307886267


##########
airflow/providers/apache/druid/CHANGELOG.rst:
##########
@@ -27,6 +27,19 @@
 Changelog
 ---------
 
+3.6.0
+.....
+
+Features
+~~~~~~~~
+

Review Comment:
   ```suggestion
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] Add druid ingestion connection test [airflow]

Posted by "eladkal (via GitHub)" <gi...@apache.org>.
eladkal commented on code in PR #33796:
URL: https://github.com/apache/airflow/pull/33796#discussion_r1367842425


##########
airflow/providers/apache/druid/CHANGELOG.rst:
##########
@@ -27,6 +27,19 @@
 Changelog
 ---------
 
+4.0.0
+.....
+
+Features
+~~~~~~~~
+
+* ``Add druid ingestion connection test And expose druid ingest hook (#33795)``
+
+.. note::
+  The connection type of Druid has been separated into two(druid_broker, druid_ingest)

Review Comment:
   The goal of the note is to explain what actions needed to be done in order to migrate.
   The note you listed says the conn is split in two so:
   1. How do I choose which one? (Or/And what is the motivation to the split)
   2. If I want to have my code work exactly as it is before what steps I need to do?
   
   These are the two questions the notes must answer



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jaegwonseo commented on a diff in pull request #33796: Add druid ingestion connection test

Posted by "jaegwonseo (via GitHub)" <gi...@apache.org>.
jaegwonseo commented on code in PR #33796:
URL: https://github.com/apache/airflow/pull/33796#discussion_r1312747085


##########
airflow/providers/apache/druid/CHANGELOG.rst:
##########
@@ -27,6 +27,19 @@
 Changelog
 ---------
 
+3.6.0
+.....
+
+Features
+~~~~~~~~
+

Review Comment:
   @potiuk @eladkal 
   i'm sorry for late replying
   i'm gonna do that tonight



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] Add druid ingestion connection test [airflow]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #33796:
URL: https://github.com/apache/airflow/pull/33796#issuecomment-1843918310

   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jaegwonseo commented on pull request #33796: Add druid ingestion connection test

Posted by "jaegwonseo (via GitHub)" <gi...@apache.org>.
jaegwonseo commented on PR #33796:
URL: https://github.com/apache/airflow/pull/33796#issuecomment-1695791861

   > I think better approach woudl be to keep one of the connections (DBApi ?) as "druid". It will at least make existing DBApi connections work.
   
   > Also - this one woll need a changelog entry describing what changed and how users should modify their connections
   
   @potiuk 
   In order to execute queries within a distributed system, Druid utilizes broker nodes, and for data ingestion, it leverages ingest nodes. Since each node has distinct roles and invocation addresses, maintaining the current setup with these two separate nodes would be preferable.
   and I do not know the exact criteria for releasing new versions in the airflow provider, so I assumed 3.6.0 for now and added the change log.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jaegwonseo commented on a diff in pull request #33796: Add druid ingestion connection test

Posted by "jaegwonseo (via GitHub)" <gi...@apache.org>.
jaegwonseo commented on code in PR #33796:
URL: https://github.com/apache/airflow/pull/33796#discussion_r1313832765


##########
airflow/providers/apache/druid/CHANGELOG.rst:
##########
@@ -27,6 +27,19 @@
 Changelog
 ---------
 
+4.0.0
+.....
+
+Features
+~~~~~~~~
+
+* ``Add druid ingestion connection test And expose druid ingest hook (#33795)``
+
+.. note::
+  The connection type of Druid has been separated into two(druid_broker, druid_ingest)

Review Comment:
   @eladkal 
   I'm not an English speaker, so I'm not sure if the following mention is accurate. Please review this
   
   ```
   The connection type of Druid has been separated into two(Druid Broker, Druid Ingest). Please perform one of the two methods listed below.
     1. update druid connect type to (Druid Broker, Druid Ingest) on UI screen (Admin/Connections)
     2. run command airflow db migrate
   ```
   
   **but in second case, airflow db migrate doesn't support update connect type** so we need implement this. 
   my suggestion is write first option only, and add an implementation to check the version of the Druid provider in the db migrate process, so that we can upgrade the Druid connector type when needed in Airflow in other issue.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] Add druid ingestion connection test [airflow]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed pull request #33796: Add druid ingestion connection test
URL: https://github.com/apache/airflow/pull/33796


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on a diff in pull request #33796: Add druid ingestion connection test

Posted by "Taragolis (via GitHub)" <gi...@apache.org>.
Taragolis commented on code in PR #33796:
URL: https://github.com/apache/airflow/pull/33796#discussion_r1313902723


##########
airflow/providers/apache/druid/CHANGELOG.rst:
##########
@@ -27,6 +27,19 @@
 Changelog
 ---------
 
+4.0.0
+.....
+
+Features
+~~~~~~~~
+
+* ``Add druid ingestion connection test And expose druid ingest hook (#33795)``
+
+.. note::
+  The connection type of Druid has been separated into two(druid_broker, druid_ingest)

Review Comment:
   Nope, we don't have connection migrations, and with current release model (that doesn't mean it is bad) I don't think it even possible to implement it. All migration happen only on upgrade Airflow Core, not Providers.
   
   Just some examples of the past breaking changes and their descriptions:
   
   1. Remove AWS S3 Connection (spoiler alert, it does impact some users)
   https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/changelog.html#id66
   
   2. Removed deprecated fields in Slack Provider (still wait when users starts complain)
   https://airflow.apache.org/docs/apache-airflow-providers-slack/stable/changelog.html#breaking-changes
   
   3. Deprecated parameter removed from Google Provider
   https://airflow.apache.org/docs/apache-airflow-providers-google/stable/changelog.html#breaking-changes
   
   The main point of this breaking changes. When user open issue/discussion in Github or write in slack it much easier to send them link to Breaking Changes, rather then try to figure out how to solve their issue.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jaegwonseo commented on a diff in pull request #33796: Add druid ingestion connection test

Posted by "jaegwonseo (via GitHub)" <gi...@apache.org>.
jaegwonseo commented on code in PR #33796:
URL: https://github.com/apache/airflow/pull/33796#discussion_r1314253423


##########
airflow/providers/apache/druid/CHANGELOG.rst:
##########
@@ -27,6 +27,19 @@
 Changelog
 ---------
 
+4.0.0
+.....
+
+Features
+~~~~~~~~
+
+* ``Add druid ingestion connection test And expose druid ingest hook (#33795)``
+
+.. note::
+  The connection type of Druid has been separated into two(druid_broker, druid_ingest)

Review Comment:
   @eladkal @Taragolis 
   Taragolis thanks for adding example link
   in this pr the only concern point is changing connection type. So, I think this mention is enough.
   ```
   In this version of provider Apache Druid Connection(conn_type="druid") has been separated into two(Druid Broker, Druid Ingest)
   Update connect type(conn_type = "druid") to druid broker, druid ingest 
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] Add druid ingestion connection test [airflow]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #33796:
URL: https://github.com/apache/airflow/pull/33796#issuecomment-1769668133

   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on a diff in pull request #33796: Add druid ingestion connection test

Posted by "eladkal (via GitHub)" <gi...@apache.org>.
eladkal commented on code in PR #33796:
URL: https://github.com/apache/airflow/pull/33796#discussion_r1313475943


##########
airflow/providers/apache/druid/CHANGELOG.rst:
##########
@@ -27,6 +27,19 @@
 Changelog
 ---------
 
+4.0.0
+.....
+
+Features
+~~~~~~~~
+
+* ``Add druid ingestion connection test And expose druid ingest hook (#33795)``
+
+.. note::
+  The connection type of Druid has been separated into two(druid_broker, druid_ingest)

Review Comment:
   Please elaborate on this one. We should explain to users how to mitigate the issue (leave instructions how to migrate to this new version)



##########
airflow/providers/apache/druid/CHANGELOG.rst:
##########
@@ -27,6 +27,19 @@
 Changelog
 ---------
 
+4.0.0
+.....
+
+Features
+~~~~~~~~
+
+* ``Add druid ingestion connection test And expose druid ingest hook (#33795)``
+
+.. note::
+  The connection type of Druid has been separated into two(druid_broker, druid_ingest)
+
+---------

Review Comment:
   ```suggestion
   ```



##########
airflow/providers/apache/druid/CHANGELOG.rst:
##########
@@ -27,6 +27,19 @@
 Changelog
 ---------
 
+4.0.0
+.....
+
+Features
+~~~~~~~~
+
+* ``Add druid ingestion connection test And expose druid ingest hook (#33795)``

Review Comment:
   ```suggestion
   ```
   
   No need for this. This is being done automatically during release proccess



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on pull request #33796: Add druid ingestion connection test

Posted by "eladkal (via GitHub)" <gi...@apache.org>.
eladkal commented on PR #33796:
URL: https://github.com/apache/airflow/pull/33796#issuecomment-1703276977

   It's almost ready :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org