Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/11/10 14:15:09 UTC

[GitHub] [beam] ilya-kozyrev commented on pull request #13112: [BEAM-11065] Apache Beam Template to ingest from Apache Kafka to Google Pub/Sub

ilya-kozyrev commented on pull request #13112:
URL: https://github.com/apache/beam/pull/13112#issuecomment-724728674


   > Hi Ilya,
   > Thank you so much for contributing this template.
   > However, Beam is not the right repo to contain templates.
   > All the Dataflow templates are located here (https://github.com/GoogleCloudPlatform/DataflowTemplates).
   > Can you please create the PR here and I would help to get it reviewed and approved.
   > Apologies for all the confusion and for the back and forth.
   > 
   > Thank you so much again for this.
   > 
   > PS: The DataflowTemplates repo holds many other template examples too.
   
   Hi Manav,
   
   Thank you very much for your comment! Let me clarify a bit. 
   We implemented this template for the Beam repository for users who would like to use Beam for common use cases like this one (Kafka -> Pub/Sub). They would get a solution that requires no changes, or only minor adjustments. We want to grow this part of the Beam repository.
   
   The important point is that the Dataflow runner is only one optional runner for this template. The template targets different runners, and we don't include any runner-specific libraries in it. Sure, if you want, you can build it for GCP and use the template in Google Dataflow; for that case we have detailed documentation in the Javadoc and the README. But this is only one of many ways the template can be used.
   
   In suggesting that templates be added to Beam, we don't want to focus only on GCP. We want to give Beam users ready-to-use solutions for common use cases.
   
   > Also, I already see a PR for the same there: [GoogleCloudPlatform/DataflowTemplates#176](https://github.com/GoogleCloudPlatform/DataflowTemplates/pull/176)
   > I guess this should be sufficient?
   
   As you correctly mentioned, we implemented a different template in the [GoogleDataflow templates repository](https://github.com/GoogleCloudPlatform/DataflowTemplates) for a similar use case. In the [template for Dataflow](https://github.com/GoogleCloudPlatform/DataflowTemplates/pull/176) we use specific libraries, coders, and functions that can be used only with the Dataflow runner and built from the DataflowTemplates repository. In this PR, however, we suggest adding a more generic template and giving the community the ability to extend templates in the future.

