You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/08/31 12:58:39 UTC

[GitHub] [airflow] dferguson992 opened a new pull request #10660: [AIRFLOW-6786] Added Kafka components

dferguson992 opened a new pull request #10660:
URL: https://github.com/apache/airflow/pull/10660


   Dear Airflow Maintainers,
   
   Please accept the following PR that
   
   Add the KafkaProducerHook.
   Add the KafkaConsumerHook.
   Add the KafkaSensor which listens to messages with a specific topic.
   Related Issue:
   #1311
   
   Issue link: AIRFLOW-6786
   
   Make sure to mark the boxes below before creating PR: [x]
   
    Description above provides context of the change
    Commit message/PR title starts with [AIRFLOW-NNNN]. AIRFLOW-NNNN = JIRA ID*
    Unit tests coverage for changes (not needed for documentation changes)
    Commits follow "How to write a good git commit message"
    Relevant documentation is updated including usage instructions.
    I will engage committers as explained in Contribution Workflow Example.
   For document-only changes commit message can start with [AIRFLOW-XXXX].
   Reminder to contributors:
   
   You must add an Apache License header to all new files
   Please squash your commits when possible and follow the 7 rules of good Git commits
   I am new to the community, I am not sure the files are at the right place or missing anything.
   
   The sensor could be used as the first node of a dag where the second node can be a TriggerDagRunOperator. The messages are polled in a batch and the dag runs are dynamically generated.
   
   Thanks!
   
   Note, as per denied PR #1415, it is important to mention these integrations are not suitable for low-latency/high-throughput/streaming. For reference, #1415 (comment).
   
   Co-authored-by: Dan Ferguson dferguson992@gmail.com
   Co-authored-by: YuanfΞi Zhu


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dferguson992 commented on a change in pull request #10660: [AIRFLOW-6786] Added Kafka components

Posted by GitBox <gi...@apache.org>.
dferguson992 commented on a change in pull request #10660:
URL: https://github.com/apache/airflow/pull/10660#discussion_r492013769



##########
File path: airflow/providers/google/cloud/example_dags/example_dataproc.py
##########
@@ -65,10 +65,16 @@
 # Update options
 # [START how_to_cloud_dataproc_updatemask_cluster_operator]
 CLUSTER_UPDATE = {
-    "config": {"worker_config": {"num_instances": 3}, "secondary_worker_config": {"num_instances": 3}}

Review comment:
       i've read through this guide and i'm still not sure what's happening here.  I've joined the slack channel and raised my concerns there as well




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #10660: [AIRFLOW-6786] Added Kafka components

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #10660:
URL: https://github.com/apache/airflow/pull/10660#discussion_r480149765



##########
File path: docs/operators-and-hooks-ref.rst
##########
@@ -176,6 +176,14 @@ Foundation.
        :mod:`airflow.providers.apache.hive.sensors.hive_partition`,
        :mod:`airflow.providers.apache.hive.sensors.metastore_partition`
 
+  * - `Apache Kafka <https://kafka.apache.org/>`__

Review comment:
       ```suggestion
      * - `Apache Kafka <https://kafka.apache.org/>`__
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ryw commented on pull request #10660: [AIRFLOW-6786] Added Kafka components

Posted by GitBox <gi...@apache.org>.
ryw commented on pull request #10660:
URL: https://github.com/apache/airflow/pull/10660#issuecomment-685711664


   @mrrobby do you want to review / test this out?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ryw commented on pull request #10660: [AIRFLOW-6786] Added Kafka components

Posted by GitBox <gi...@apache.org>.
ryw commented on pull request #10660:
URL: https://github.com/apache/airflow/pull/10660#issuecomment-736991032


   @dferguson992 curious why you closed this?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dferguson992 commented on a change in pull request #10660: [AIRFLOW-6786] Added Kafka components

Posted by GitBox <gi...@apache.org>.
dferguson992 commented on a change in pull request #10660:
URL: https://github.com/apache/airflow/pull/10660#discussion_r492013769



##########
File path: airflow/providers/google/cloud/example_dags/example_dataproc.py
##########
@@ -65,10 +65,16 @@
 # Update options
 # [START how_to_cloud_dataproc_updatemask_cluster_operator]
 CLUSTER_UPDATE = {
-    "config": {"worker_config": {"num_instances": 3}, "secondary_worker_config": {"num_instances": 3}}

Review comment:
       i've read through this guide and i'm still not sure what's happening here.  I've joined the slack channel and raised my concerns there as well




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] RosterIn commented on a change in pull request #10660: [AIRFLOW-6786] Added Kafka components

Posted by GitBox <gi...@apache.org>.
RosterIn commented on a change in pull request #10660:
URL: https://github.com/apache/airflow/pull/10660#discussion_r490910286



##########
File path: airflow/providers/google/cloud/example_dags/example_dataproc.py
##########
@@ -65,10 +65,16 @@
 # Update options
 # [START how_to_cloud_dataproc_updatemask_cluster_operator]
 CLUSTER_UPDATE = {
-    "config": {"worker_config": {"num_instances": 3}, "secondary_worker_config": {"num_instances": 3}}

Review comment:
       There is a really good guide at the contributing section:
   https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#id9
   On slack there is a channel airflow-how-to-pr where you can ask for help if you get into trouble.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dferguson992 closed pull request #10660: [AIRFLOW-6786] Added Kafka components

Posted by GitBox <gi...@apache.org>.
dferguson992 closed pull request #10660:
URL: https://github.com/apache/airflow/pull/10660


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] RosterIn commented on a change in pull request #10660: [AIRFLOW-6786] Added Kafka components

Posted by GitBox <gi...@apache.org>.
RosterIn commented on a change in pull request #10660:
URL: https://github.com/apache/airflow/pull/10660#discussion_r490279256



##########
File path: airflow/providers/google/cloud/example_dags/example_dataproc.py
##########
@@ -65,10 +65,16 @@
 # Update options
 # [START how_to_cloud_dataproc_updatemask_cluster_operator]
 CLUSTER_UPDATE = {
-    "config": {"worker_config": {"num_instances": 3}, "secondary_worker_config": {"num_instances": 3}}

Review comment:
       How is adding kafka integration related to modifying google dag examples?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dferguson992 commented on a change in pull request #10660: [AIRFLOW-6786] Added Kafka components

Posted by GitBox <gi...@apache.org>.
dferguson992 commented on a change in pull request #10660:
URL: https://github.com/apache/airflow/pull/10660#discussion_r490895415



##########
File path: airflow/providers/google/cloud/example_dags/example_dataproc.py
##########
@@ -65,10 +65,16 @@
 # Update options
 # [START how_to_cloud_dataproc_updatemask_cluster_operator]
 CLUSTER_UPDATE = {
-    "config": {"worker_config": {"num_instances": 3}, "secondary_worker_config": {"num_instances": 3}}

Review comment:
       idk how those even got in.  Whenever i can work on this, i rebase from master and a whole bunch of changes get added..  i think I'm rebasing wrong but i don't understand how.  `git rebase -i <my_branch> master`?
   
   I never have enough time to really isolate the issue with this PR, especially since everytime i try toupdate my local branch i incur other commits that I didn't write.  not sure what's wrong with this at all.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org