Posted to user@flink.apache.org by Till Rohrmann <tr...@apache.org> on 2018/03/19 18:06:11 UTC

[ANNOUNCE] Weekly community update #12

Dear community,

I've noticed that Flink has grown quite a bit over time. As a consequence,
it can be quite challenging to stay up to date, especially for community
members who don't follow Flink's MLs on a daily basis.

In order to keep a bigger part of the community in the loop, I want to
try out a weekly update letter in which I share what happened from my
perspective. Since I don't know everything either, I want to
encourage others to post updates about things they deem important and
relevant for the community to this thread.

# Weekly update #12:

## Flink 1.5 release:
- The Flink community is still working on the Flink 1.5 release. Hopefully
Flink 1.5 can be released in the coming weeks.
- Last week's work concentrated mainly on stabilizing Flip-6 and adding
more automated tests [1]. The Flink community appreciates every helping
hand with adding more end-to-end tests.
- Consequently, the committed changes mainly consisted of bug fixes and
test hardening.
- By the end of this week, we hope to have an RC ready which can be used for
easier release testing. Given the big changes (network stack and Flip-6),
the RC will most likely still contain some rough edges. In order to smooth
them out, it would be good if we ran Flink 1.5 in as many different
scenarios as possible.

## Flink 1.3.3 has been released
- Flink 1.3.3, which contains an important fix for properly handling
checkpoints in case of a DFS problem, has been released. We highly recommend
that all users running Flink 1.3.2 upgrade swiftly to Flink 1.3.3.

## Misc:
- Shuyi opened a discussion about improving Flink's security [2]. If you
are interested and want to help with the next steps, please join the
discussion.

PS: Don't worry, you haven't missed the first 11 weekly community updates.
The 12 is just the number of the current calendar week.

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Flink-1-5-release-testing-effort-td21646.html
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html

Cheers,
Till

Re: [ANNOUNCE] Weekly community update #12

Posted by Till Rohrmann <tr...@apache.org>.
Eron pointed out to me that Flink improvement proposal 6 (Flip-6 for short)
deserves some more explanation since not everyone will be aware of what it
actually means. I totally agree and will try to give a bit more context for
everyone interested.

Flip-6 is intended to solve some of Flink's shortcomings with respect
to resource management and to improve its deployment flexibility. We have seen
in the past that Flink's legacy abstraction was not well suited to support
an ever-increasing set of different deployments.

Flink started with the standalone mode which runs Flink on a bare-metal
cluster. Soon it became evident that people would like to run Flink on top
of cluster resource managers such as Yarn or Mesos. Consequently, support
for Yarn was added. Until then, there was only a single execution mode
which is now called the session mode. The session mode allows you to run
multiple jobs on the same Flink cluster at the cost of no resource
isolation. With Yarn we added a new per-job mode which starts a Flink
cluster for each job and gives you resource isolation. The next step was
the integration with Mesos and now many people want to run Flink in a
containerized environment (Docker and Kubernetes).
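
As a rough illustration of the difference between the two modes (this is a toy model, not Flink code; the function name and the cluster names are made up for the example), the placement of jobs onto clusters could be sketched like this:

```python
from collections import defaultdict


def assign_clusters(jobs, mode):
    """Map hypothetical cluster names to the jobs that run on them.

    mode="session": all jobs share one cluster (no resource isolation).
    mode="per-job": every job gets its own cluster (resource isolation).
    """
    clusters = defaultdict(list)
    for job in jobs:
        if mode == "session":
            # Every job lands on the same long-running cluster.
            clusters["session-cluster"].append(job)
        elif mode == "per-job":
            # A dedicated cluster is started for each job.
            clusters[f"cluster-for-{job}"].append(job)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return dict(clusters)


if __name__ == "__main__":
    jobs = ["wordcount", "fraud-detection"]
    print(assign_clusters(jobs, "session"))
    print(assign_clusters(jobs, "per-job"))
```

In the session case both jobs end up in the same entry, which is exactly why they compete for the same resources.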

On top of that, a feature that has been much sought after for quite some
time is that Flink should be able to dynamically allocate more resources in
order to scale jobs up, and to release resources if they are not used to
capacity. That way one won't waste resources or under-provision the Flink
cluster when facing changing workloads.
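
To make the idea of elasticity concrete, here is a minimal sketch of such a scaling decision. It assumes that demand is expressed as a number of required task slots and that every worker offers a fixed number of slots; both the function names and this simplified model are assumptions for illustration, not how Flink actually decides.

```python
import math


def executors_needed(required_slots, slots_per_executor):
    """Smallest number of workers whose slots cover the demand."""
    if slots_per_executor <= 0:
        raise ValueError("slots_per_executor must be positive")
    return math.ceil(required_slots / slots_per_executor)


def scaling_decision(required_slots, slots_per_executor, running_executors):
    """Positive result: allocate that many workers.
    Negative result: that many workers can be released.
    Zero: the cluster is sized correctly."""
    target = executors_needed(required_slots, slots_per_executor)
    return target - running_executors


if __name__ == "__main__":
    # Demand grew to 10 slots, 4 slots per worker, 2 workers running.
    print(scaling_decision(10, 4, 2))
    # Demand dropped to 3 slots while 3 workers are still running.
    print(scaling_decision(3, 4, 3))
```

A static cluster corresponds to never acting on this number, which leads either to idle workers or to jobs that cannot get enough slots.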

Since Flink has grown over time and some of the requirements weren't clear
from the very beginning, it seemed quite difficult to make Flink work in
all the different settings with support for dynamic scaling. So, in order to
make Flink future-proof deployment-wise and to add support for full
resource elasticity, the community started the Flip-6 effort.

Flip-6 split the existing architecture up into four components: JobMaster,
TaskExecutor, ResourceManager and Dispatcher. The JobMaster is now
responsible for running a single job. The TaskExecutor remained more or
less the same and is responsible for executing tasks which the JobMaster
deploys to it. The ResourceManager is the integration component with an
external system such as Yarn or Mesos. Its task is to allocate new
containers/tasks to spawn new TaskExecutors if need be. The Dispatcher is
the component responsible for receiving new jobs and spawning a new
JobMaster to execute each of them.
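
The interplay of the four components can be sketched with a small toy model. Only the four class names come from the description above; every method, attribute and the "one TaskExecutor per task" simplification are assumptions made for this illustration and do not mirror Flink's real classes or RPC interfaces.

```python
class TaskExecutor:
    """Executes the tasks that a JobMaster deploys to it."""

    def __init__(self, executor_id):
        self.executor_id = executor_id
        self.tasks = []

    def run(self, task):
        self.tasks.append(task)


class ResourceManager:
    """Stands in for the integration with an external system (e.g. Yarn)."""

    def __init__(self):
        self.executors = []

    def request_executor(self):
        # In this toy model a new TaskExecutor is spawned on every request.
        executor = TaskExecutor(f"executor-{len(self.executors)}")
        self.executors.append(executor)
        return executor


class JobMaster:
    """Responsible for running exactly one job."""

    def __init__(self, job_name, tasks, resource_manager):
        self.job_name = job_name
        self.tasks = tasks
        self.resource_manager = resource_manager

    def execute(self):
        # Simplification: one TaskExecutor per task.
        for task in self.tasks:
            executor = self.resource_manager.request_executor()
            executor.run((self.job_name, task))


class Dispatcher:
    """Receives new jobs and spawns a JobMaster for each of them."""

    def __init__(self, resource_manager):
        self.resource_manager = resource_manager
        self.job_masters = []

    def submit(self, job_name, tasks):
        job_master = JobMaster(job_name, tasks, self.resource_manager)
        self.job_masters.append(job_master)
        job_master.execute()
        return job_master


if __name__ == "__main__":
    rm = ResourceManager()
    dispatcher = Dispatcher(rm)
    dispatcher.submit("wordcount", ["source", "map", "sink"])
    for executor in rm.executors:
        print(executor.executor_id, executor.tasks)
```

Submitting a second job to the same Dispatcher creates a second JobMaster, which corresponds to the session style of running several jobs against one set of components.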

The idea now is to use these building blocks to implement the session as
well as the per-job mode in the different deployment scenarios.

Flink 1.5 will run on the new Flip-6 architecture by default and supports
Yarn, Mesos as well as the standalone mode. Thus, it supports the same
deployment modes that Flink 1.4 supported. Additionally, it should now be
easier to run Flink in a containerized environment since the client now
communicates via REST calls with the Flink cluster. On Yarn and Mesos, it
will also be able to dynamically allocate and free resources, which enables
rescaling of jobs.

The next logical step would be to provide a better Kubernetes integration
which allows Kubernetes to add and remove pods which are then automatically
used by Flink.

For more information, you can take a look at [1], which gives an overview
of the architecture, or simply reach out to me.

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077

Cheers,
Till

On Tue, Mar 20, 2018 at 9:35 PM, Stephan Ewen <se...@apache.org> wrote:

> Great initiative, highly appreciated, Till!

Re: [ANNOUNCE] Weekly community update #12

Posted by Stephan Ewen <se...@apache.org>.
Great initiative, highly appreciated, Till!


