Posted to user@beam.apache.org by Đức Trần Tiến <tr...@gmail.com> on 2021/03/24 11:24:29 UTC

[Question] Need to write a pipeline in Go consuming events from Kafka

Hi,

I am very new to both Go and Apache Beam. This is my situation:
 - I have Kafka running
 - I want to write an ETL pipeline in Go that consumes data from Kafka

There is no Kafka support in the Go SDK, only in the Java SDK. I have been
looking for a way to create an unbounded collection in Go and then write
the code to consume from Kafka myself, but without success. *beam.Create()*
only supports creating a finite data source.

Could you please guide me or give me some ideas to make progress? Either
creating an unbounded collection or a direct way to create an ETL
pipeline that consumes from Kafka would help.

And the last question: Could I write that pipeline in Java and invoke that
pipeline from Go? :D

Thanks and regards,

Duc Tran

Re: [Question] Need to write a pipeline in Go consuming events from Kafka

Posted by Robert Burke <re...@google.com>.
At present, there's no way to write an unbounded data source with the Go
SDK. That would require DoFn Self Checkpointing
(https://issues.apache.org/jira/browse/BEAM-11104) and Watermark Estimation
(https://issues.apache.org/jira/browse/BEAM-11105).

Daniel is working on wrapping the Java Kafka connector in the next few
months using the system Robert Bradshaw linked. Older Kafka + Go specific
JIRAs exist at https://issues.apache.org/jira/browse/BEAM-4250 and
https://issues.apache.org/jira/browse/BEAM-6260, but they should probably
be retired in favour of one that mentions Xlang specifically. I
believe Daniel will file the new JIRAs once he breaks down the task.

See
https://cwiki.apache.org/confluence/display/BEAM/Supporting+Streaming+in+the+Go+SDK
for more details on what it will take for streaming with the Go SDK.

On Mon, Mar 29, 2021 at 9:28 AM Robert Bradshaw <ro...@google.com> wrote:

> On Wed, Mar 24, 2021 at 4:24 AM Đức Trần Tiến <tr...@gmail.com>
> wrote:
>
>>
>> And the last question: Could I write that pipeline in Java and invoke
>> that pipeline from Go? :D
>>
>
> That is exactly the story we're trying to pursue for getting the large set
> of Java connectors available to Go:
>
>
> https://cloud.google.com/blog/products/data-analytics/multi-language-sdks-for-building-cloud-pipelines
>
>
> Cc'ing some folks that can comment on the status.
>
>

Re: [Question] Need to write a pipeline in Go consuming events from Kafka

Posted by Robert Bradshaw <ro...@google.com>.
On Wed, Mar 24, 2021 at 4:24 AM Đức Trần Tiến <tr...@gmail.com>
wrote:

>
> And the last question: Could I write that pipeline in Java and invoke that
> pipeline from Go? :D
>

That is exactly the story we're trying to pursue for getting the large set
of Java connectors available to Go:

https://cloud.google.com/blog/products/data-analytics/multi-language-sdks-for-building-cloud-pipelines


Cc'ing some folks that can comment on the status.

Re: [Question] Need to write a pipeline in Go consuming events from Kafka

Posted by Brian Hulette <bh...@google.com>.
+Robert Burke <re...@google.com> any advice here?
