Posted to dev@flink.apache.org by Dongwon Kim <ea...@postech.ac.kr> on 2016/02/05 17:14:50 UTC

Want Flink startup issues :-)

Hello,

I'm Dongwon Kim and I want to get involved in the Flink community.
Can anyone guide me through contributing to Flink with some starter issues?
Although my research interests lie in big data systems including Flink,
Spark, MapReduce, and Tez, I've never participated in open source
communities.

FYI, I've done the following things over the past few years:
- I've studied Apache Hadoop (MRv1, MRv2, and YARN), Apache Tez, and
Apache Spark through the source code.
- My doctoral thesis is about improving the performance of MRv1 by
making network pipelines between mappers and reducers like what Flink
does.
- I've used Ganglia to monitor the cluster performance and I've been
interested in metrics and counters in big data systems.
- I gave a talk named "a comparative performance evaluation of Flink"
at the last Flink Forward.

I would really appreciate it if someone could help me get involved in the
most promising ASF project :-)

Greetings,
Dongwon Kim

Re: Want Flink startup issues :-)

Posted by "Matthias J. Sax" <mj...@apache.org>.

For the road map ideas, there are often no JIRAs created yet. Mostly,
road map ideas are more complex things to get done, requiring design
documents and discussions before the actual coding can be done.

Usually, we create the JIRA (or multiple JIRAs) during the design phase.
So just watch the mailing list to keep track of the road map ideas you
are interested in. Of course, if you want to get started with any of
those, you can start the discussion on the mailing list yourself and also
start a design document etc.

Just be aware that this process will take some time, as the community
will give you a lot of feedback etc. If you want to get started more
quickly, working on an existing JIRA with limited scope is a good
starting point -- or you just do both in parallel ;)

-Matthias


On 02/06/2016 02:10 PM, Dongwon Kim wrote:
> Hi Chiwan!
> 
> That's what I wanted to know!
> Thanks!
> 
> Dongwon Kim

Re: Want Flink startup issues :-)

Posted by Dongwon Kim <ea...@postech.ac.kr>.

Hi Chiwan!

That's what I wanted to know!
Thanks!

Dongwon Kim

2016-02-06 22:00 GMT+09:00 Chiwan Park <ch...@apache.org>:
> Hi Dongwon,
>
> Yes, the steps are: pick an issue (by assigning it to yourself or commenting on it), make your changes, and send a pull request for it.
>
> Welcome! :)
>
> Regards,
> Chiwan Park

Re: Want Flink startup issues :-)

Posted by Chiwan Park <ch...@apache.org>.

Hi Dongwon,

Yes, the steps are: pick an issue (by assigning it to yourself or commenting on it), make your changes, and send a pull request for it.

Welcome! :)

Regards,
Chiwan Park

> On Feb 6, 2016, at 3:31 PM, Dongwon Kim <ea...@postech.ac.kr> wrote:
> 
> Hi Fabian, Matthias, Robert!
> 
> Thank you for welcoming me to the community :-)
> I'm taking a look at JIRA and "How to contribute" as you guys suggested.
> One trivial question: do I just need to make a pull request
> after figuring out an issue?
> If so, I'll pick an issue, figure it out, and then make a pull
> request myself ;-)
> 
> Meanwhile, I also read the roadmap and found a few plans that caught my interest.
> - Making YARN resource dynamic
> - DataSet API Enhancements
> - Expose more runtime metrics
> Could any of you point me to new or existing issues regarding the above?
> 
> Thanks!
> 
> Dongwon

Re: Want Flink startup issues :-)

Posted by Dongwon Kim <ea...@postech.ac.kr>.

Hi Fabian, Matthias, Robert!

Thank you for welcoming me to the community :-)
I'm taking a look at JIRA and "How to contribute" as you guys suggested.
One trivial question: do I just need to make a pull request
after figuring out an issue?
If so, I'll pick an issue, figure it out, and then make a pull
request myself ;-)

Meanwhile, I also read the roadmap and found a few plans that caught my interest.
- Making YARN resource dynamic
- DataSet API Enhancements
- Expose more runtime metrics
Could any of you point me to new or existing issues regarding the above?

Thanks!

Dongwon

2016-02-06 4:55 GMT+09:00 Fabian Hueske <fh...@gmail.com>:
> Hi Dongwon,
>
> welcome to the Flink mailing list!
> What kind of issues are you interested in?
>
> - API / library features: DataSet API, DataStream API, SQL, StreamSQL,
> Graphs (Gelly)
> - Processing runtime: Batch, Streaming
> - Connectors to other systems: Stream sources/sinks
> - Web dashboard
> - Compatibility: Storm, Hadoop
>
> You can also have a look at Flink's JIRA issue tracker [1]. Right now, we
> have about 600 issues listed, covering every level of difficulty and effort.
> If you find an issue that sounds interesting, just drop a note and we can
> give you more details about it.
>
> Best, Fabian
>
> [1]
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20resolution%20%3D%20Unresolved

Re: Want Flink startup issues :-)

Posted by "Matthias J. Sax" <mj...@apache.org>.

Hi Dongwon,

very cool that you decided to join the community. Btw: very nice talk at
Flink Forward!

Fabian pointed out the most important things already.

One more thing I wanted to add (just in case you are not aware of it
already): there is a "How to contribute" section on the Flink web page:
https://flink.apache.org/how-to-contribute.html

This should also help to get you started. Looking forward to your first
pull request!

-Matthias


On 02/05/2016 08:55 PM, Fabian Hueske wrote:
> Hi Dongwon,
> 
> welcome to the Flink mailing list!
> What kind of issues are you interested in?
> 
> - API / library features: DataSet API, DataStream API, SQL, StreamSQL,
> Graphs (Gelly)
> - Processing runtime: Batch, Streaming
> - Connectors to other systems: Stream sources/sinks
> - Web dashboard
> - Compatibility: Storm, Hadoop
> 
> You can also have a look at Flink's JIRA issue tracker [1]. Right now, we
> have about 600 issues listed, covering every level of difficulty and effort.
> If you find an issue that sounds interesting, just drop a note and we can
> give you more details about it.
> 
> Best, Fabian
> 
> [1]
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20resolution%20%3D%20Unresolved

Re: Want Flink startup issues :-)

Posted by Robert Metzger <rm...@apache.org>.

Hi Dongwon Kim,

it's great to see you here. I really enjoyed your talk at Flink Forward; you
did very good and detailed research on the different systems! (Those who
didn't see the talk: go watch it on YouTube).

Maybe you are interested in working on improving our monitoring / metrics
system. People have asked for ways to expose them to other systems (via
JMX, Ganglia, ...). There were also questions about writing the metrics to
disk (in a CSV file or similar).
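
To make the "write the metrics to disk" idea concrete, here is a minimal sketch in Python. It is not Flink code and does not use any Flink API; the function name, the metric names, and the three-column layout (timestamp, name, value) are all assumptions chosen for illustration.

```python
import csv
import io
import time


def write_metrics_csv(out, metrics, timestamp=None):
    """Write one CSV row per metric: timestamp, name, value.

    Hypothetical helper for illustration only, not part of Flink.
    """
    ts = int(time.time()) if timestamp is None else timestamp
    writer = csv.writer(out, lineterminator="\n")
    # Sort by name so the row order is stable between snapshots.
    for name, value in sorted(metrics.items()):
        writer.writerow([ts, name, value])


# Made-up counters standing in for real runtime metrics.
sample = {"records_in": 1200, "records_out": 1180}

# An in-memory buffer keeps the example self-contained; a real reporter
# would more likely append to a file opened with open(path, "a", newline="").
buf = io.StringIO()
write_metrics_csv(buf, sample, timestamp=0)
print(buf.getvalue())
```

A plain row-per-metric layout like this is easy to load into other monitoring tools, which is the kind of integration people were asking about.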


On Fri, Feb 5, 2016 at 8:55 PM, Fabian Hueske <fh...@gmail.com> wrote:

> Hi Dongwon,
>
> welcome to the Flink mailing list!
> What kind of issues are you interested in?
>
> - API / library features: DataSet API, DataStream API, SQL, StreamSQL,
> Graphs (Gelly)
> - Processing runtime: Batch, Streaming
> - Connectors to other systems: Stream sources/sinks
> - Web dashboard
> - Compatibility: Storm, Hadoop
>
> You can also have a look at Flink's JIRA issue tracker [1]. Right now, we
> have about 600 issues listed, covering every level of difficulty and effort.
> If you find an issue that sounds interesting, just drop a note and we can
> give you more details about it.
>
> Best, Fabian
>
> [1]
>
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20resolution%20%3D%20Unresolved

Re: Want Flink startup issues :-)

Posted by Fabian Hueske <fh...@gmail.com>.

Hi Dongwon,

welcome to the Flink mailing list!
What kind of issues are you interested in?

- API / library features: DataSet API, DataStream API, SQL, StreamSQL,
Graphs (Gelly)
- Processing runtime: Batch, Streaming
- Connectors to other systems: Stream sources/sinks
- Web dashboard
- Compatibility: Storm, Hadoop

You can also have a look at Flink's JIRA issue tracker [1]. Right now, we
have about 600 issues listed, covering every level of difficulty and effort.
If you find an issue that sounds interesting, just drop a note and we can
give you more details about it.

Best, Fabian

[1]
https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20resolution%20%3D%20Unresolved
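
For anyone wondering what the link in [1] selects: the part after "jql=" is a percent-encoded JIRA query. Below is a minimal Python sketch, using only the standard library, that decodes it and builds a narrower search link. The helper name search_url and the extra free-text clause are assumptions for illustration, not something suggested in this thread.

```python
from urllib.parse import parse_qs, quote, urlsplit

# The link from [1], copied verbatim.
url = ("https://issues.apache.org/jira/issues/"
       "?jql=project%20%3D%20FLINK%20AND%20resolution%20%3D%20Unresolved")

# parse_qs splits the query string and percent-decodes each value.
jql = parse_qs(urlsplit(url).query)["jql"][0]
print(jql)  # project = FLINK AND resolution = Unresolved


def search_url(query):
    """Build a search link for a JQL query (helper name is hypothetical)."""
    return "https://issues.apache.org/jira/issues/?jql=" + quote(query, safe="")


# Assumption for illustration: narrow the search with a free-text clause.
narrower = jql + ' AND text ~ "metrics"'
print(search_url(narrower))
```

Narrowing the query this way is one option for cutting the list of open issues down to a topic of interest before picking one.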

2016-02-05 17:14 GMT+01:00 Dongwon Kim <ea...@postech.ac.kr>:

> Hello,
>
> I'm Dongwon Kim and I want to get involved in the Flink community.
> Can anyone guide me through contributing to Flink with some starter issues?
> Although my research interests lie in big data systems including Flink,
> Spark, MapReduce, and Tez, I've never participated in open source
> communities.
>
> FYI, I've done the following things over the past few years:
> - I've studied Apache Hadoop (MRv1, MRv2, and YARN), Apache Tez, and
> Apache Spark through the source code.
> - My doctoral thesis is about improving the performance of MRv1 by
> making network pipelines between mappers and reducers like what Flink
> does.
> - I've used Ganglia to monitor the cluster performance and I've been
> interested in metrics and counters in big data systems.
> - I gave a talk named "a comparative performance evaluation of Flink"
> at the last Flink Forward.
>
> I would really appreciate it if someone could help me get involved in the
> most promising ASF project :-)
>
> Greetings,
> Dongwon Kim
>