Posted to dev@beam.apache.org by Weiqing Chen <we...@outlook.com> on 2022/02/11 22:39:36 UTC

Student Interested in BEAM-13865 problem for GSoc2021

Dear:

I am Chen Weiqing, a master's student majoring in Computer Science at the University of Virginia. I have read the proposal and am interested in taking on BEAM-13865 for GSOC 2021. May I ask where I should start?

Thanks

Chen Weiqing

Re: Student Interested in BEAM-13865 problem for GSoc2021

Posted by Pablo Estrada <pa...@google.com>.
Hi Weiqing!
As Kerry mentions, it is good to familiarize yourself with the general
objective of a Beam IO connector, and a sink in particular. Basically: You
need your Beam pipeline to write data to an external data store. How does
someone write this in Beam?
You can look at some examples in the Java or Python SDKs. Later you can also
look at this document: https://s.apache.org/beam-io-api-standard - it
contains guidelines for someone writing a new API for Beam.

In general, if you'd like to start getting familiar with the codebase, I'd
recommend you try out these developer guides:
https://cwiki.apache.org/confluence/display/BEAM/Developer+Guides

And if you feel like it, you can pick up simple / starter issues from
JIRA[1]. Probably for Java, since this project is in Java : )
Feel free to keep asking questions on this mailing list!
-P.

[1]
https://issues.apache.org/jira/browse/BEAM-6035?jql=project%20%3D%20BEAM%20AND%20status%20%3D%20Open%20AND%20labels%20in%20(starter%2C%20newbie)%20ORDER%20BY%20created%20DESC

On Mon, Feb 14, 2022 at 6:40 AM Kerry Donny-Clark <ke...@google.com>
wrote:

> Hi Chen,
> That's great to hear! I think a good first step would be to compare the
> doc and some of the current Java IOs. This will help you understand the
> problem that this is trying to solve.
> I would look at the IOs in
> https://github.com/apache/beam/tree/master/sdks/java/io, maybe focusing
> on ones whose IO target you already know (Kafka, if you have Kafka
> experience, for example).
> Then, review https://github.com/apache/beam/pull/16763 and ask questions.
> Finally, Pablo will have more advice, and should reply here soon.
> Kerry
>
> On Fri, Feb 11, 2022 at 6:19 PM Weiqing Chen <we...@outlook.com>
> wrote:
>
>> Dear:
>>
>> I am Chen Weiqing, a master's student majoring in Computer Science at the
>> University of Virginia. I have read the proposal and am interested in
>> taking on BEAM-13865 for GSOC 2021. May I ask where I should start?
>>
>> Thanks
>>
>> Chen Weiqing
>>
>

Re: Student Interested in BEAM-13865 problem for GSoc2021

Posted by Kerry Donny-Clark <ke...@google.com>.
Hi Chen,
That's great to hear! I think a good first step would be to compare the doc
and some of the current Java IOs. This will help you understand the problem
that this is trying to solve.
I would look at the IOs in
https://github.com/apache/beam/tree/master/sdks/java/io, maybe focusing on
ones whose IO target you already know (Kafka, if you have Kafka experience,
for example).
Then, review https://github.com/apache/beam/pull/16763 and ask questions.
Finally, Pablo will have more advice, and should reply here soon.
Kerry

On Fri, Feb 11, 2022 at 6:19 PM Weiqing Chen <we...@outlook.com>
wrote:

> Dear:
>
> I am Chen Weiqing, a master's student majoring in Computer Science at the
> University of Virginia. I have read the proposal and am interested in
> taking on BEAM-13865 for GSOC 2021. May I ask where I should start?
>
> Thanks
>
> Chen Weiqing
>